INVERSION OF THE NEGATIVE-SELECTIVE EFFECT OF NEGATIVE
MARKER PROTEINS USING SELECTION METHODS



Description 

The present invention relates to processes for preparing transformed
plant cells or organisms by transforming a population of plant cells
which comprises at least one marker protein having a direct or indirect
toxic effect for said population, with at least one nucleic acid
sequence to be inserted in combination with at least one compound,
preferably a DNA construct, capable of reducing the expression, amount,
activity and/or function of the marker protein, with the transformed
plant cells having a growth advantage over nontransformed cells, due to
the action of said compound.

Genetic material is successfully introduced usually only into a very
limited number of target cells of a population. This necessitates the
distinction and isolation of successfully transformed from
nontransformed cells, a process which is referred to as selection.
Traditionally, the selection is carried out by way of a "positive"
selection, wherein the transformed cell is enabled to grow and to
survive, whereas the untransformed cell is inhibited in its growth or
destroyed (McCormick et al. (1986) Plant Cell Reports 5:81-84). A
positive selection of this kind is usually implemented by genes which
code for a resistance to a biocide (e.g. a herbicide such as
phosphinothricin, glyphosate or bromoxynil, a metabolism inhibitor such
as 2-deoxyglucose 6-phosphate (WO 98/45456) or an antibiotic such as
tetracycline, ampicillin, kanamycin, G418, neomycin, bleomycin or
hygromycin). Such genes are also referred to as positive selection
markers. The positive selection marker is coupled (physically or by
means of cotransformation) to the nucleic acid sequence to be
introduced into the cell genome and is then introduced into the cell.
Subsequently, the cells are cultured on a medium under the appropriate
selection pressure (for example in the presence of an appropriate
antibiotic or herbicide), whereby the transformed cells, owing to the
acquired resistance to said selection pressure, have a growth/survival
advantage and can thus be selected. Positive selection markers which
may be mentioned by way of example are:

phosphinothricin acetyltransferases (PAT) (also: Bialophos®
resistance; bar) acetylate the free amino group of the glutamine
synthase inhibitor phosphinothricin (PPT) and thus achieve a
detoxification (de Block et al. (1987) EMBO J 6:2513-2518; Vickers JE
et al. (1996) Plant Mol Biol Reporter 14:363-368; Thompson CJ et al.
(1987) EMBO J 6:2519-2523).



5-enolpyruvylshikimate 3-phosphate synthases (EPSPS) impart a
resistance to the unselective herbicide Glyphosat®
(N-(phosphonomethyl)glycine; Steinrucken HC et al. (1980) Biochem
Biophys Res Commun 94:1207-1212; Levin JG and Sprinson DB (1964) J Biol
Chem 239:1142-1150; Cole DJ (1985) Mode of action of glyphosate; A
literature analysis, p. 48-74. In: Grossbard E and Atkinson D (eds.)
The herbicide glyphosate. Butterworths, Boston). Glyphosate-tolerant
EPSPS variants for use as selection markers have been described
(Padgette SR et al. (1996). New weed control opportunities: development
of soybeans with a Roundup Ready™ gene. In: Herbicide Resistant Crops
(Duke SO, ed.), pp. 53-84. CRC Press, Boca Raton, FL; Saroha MK and
Malik VS (1998) J Plant Biochemistry and Biotechnology 7:65-72;
Padgette SR et al. (1995) Crop Science 35(5):1451-1461; US 5,510,471;
US 5,776,760; US 5,864,425; US 5,633,435; US 5,627,061; US 5,463,175;
EP-A 0 218 571).






neomycin phosphotransferases constantly impart a resistance to
aminoglycoside antibiotics such as neomycin, G418, hygromycin,
paromomycin or kanamycin by reducing the inhibiting action thereof by
means of a phosphorylation reaction (Beck et al. (1982) Gene
19:327-336).

2-deoxyglucose 6-phosphate phosphatases impart a resistance
to 2-deoxyglucose (EP-A 0 807 836; Randez-Gil et al. (1995)
Yeast 11:1233-1240; Sanz et al. (1994) Yeast 10:1195-1202).



acetolactate synthases impart a resistance to imidazolinone/
sulfonylurea herbicides (e.g. imazamox, imazapyr, imazaquin,
imazethapyr, amidosulfuron, azimsulfuron, chlorimuron ethyl,
chlorsulfuron; Sathasivan K et al. (1990) Nucleic Acids Res
18(8):2188).



In addition, resistance genes to the antibiotics hygromycin
(hygromycin phosphotransferases), chloramphenicol (chloramphenicol
acetyltransferase), tetracycline, streptomycin, zeocine and ampicillin
(β-lactamase gene; Datta N, Richmond MH (1966) Biochem J 98(1):204-9)
have been described.



Genes such as isopentenyl transferase (ipt) from Agrobacterium
tumefaciens (strain PO22) (GenBank Acc. No.: AB025109) may likewise be
used as selection markers. The ipt gene codes for a key enzyme of
cytokinin biosynthesis. Its overexpression facilitates the regeneration
of plants (e.g. selection on cytokinin-free medium) (Ebinuma H et al.
(2000) Proc Natl Acad Sci USA 94:2117-2121; Ebinuma H et al. (2000)
Selection of Marker-free transgenic plants using the oncogenes (ipt,
rol A, B, C) of Agrobacterium as selectable markers. In Molecular
Biology of Woody Plants. Kluwer Academic Publishers). The disadvantages
here are, firstly, the fact that the selection advantage is based on
usually subtle differences in cell proliferation and, secondly, the
fact that the plant acquires unwanted properties (gall tumor formation)
due to transformation with an oncogene.

EP-A 0 601 092 describes various other positive selection markers.
Examples which may be mentioned are: β-glucuronidase (in connection
with, for example, cytokinin glucuronide), mannose 6-phosphate
isomerase (in connection with mannose), UDP-galactose 4-epimerase (in
connection with galactose, for example).



Negative selection markers are used for selecting organisms in which
marker sequences have been successfully deleted (Koprek T et al. (1999)
Plant J 19(6):719-726). In the presence of a negative selection marker,
the corresponding cell is destroyed or experiences a growth
disadvantage. Negative selection involves, for example, the negative
selection marker introduced into the plant converting a compound which
otherwise has no action disadvantageous to the plant into a compound
with a disadvantageous (i.e. toxic) action. Examples of negative
selection markers include: thymidine kinase (TK), for example of Herpes
simplex virus (Wigler et al. (1977) Cell 11:223), cellular adenine
phosphoribosyl transferase (APRT) (Wigler et al. (1979) Proc Natl Acad
Sci USA 76:1373), hypoxanthine phosphoribosyl transferase (HPRT) (Jolly
et al. (1983) Proc Natl Acad Sci USA 80:477), diphtheria toxin A
fragment (DT-A), the bacterial xanthine-guanine phosphoribosyl
transferase (gpt; Besnard et al. (1987) Mol. Cell. Biol. 7:4139; Mzoz
and Moolten (1993) Human Gene Therapy 4:589-595), the codA gene coding
for a cytosine deaminase (Gleave AP et al. (1999) Plant Mol Biol
40(2):223-35; Perera RJ et al. (1993) Plant Mol Biol 23(4):793-799;
Stougaard J (1993) Plant J 3:755-761; EP-A1 595 873), the cytochrome
P450 gene (Koprek et al. (1999) Plant J 19:719-726), genes coding for a
haloalkane dehalogenase (Naested H (1999) Plant J 18:571-576), the iaaH
gene (Sundaresan V et al. (1995) Genes & Development 9:1797-1810) or
the tms2 gene (Fedoroff NV & Smith DL (1993) Plant J 3:273-289). The
negative selection markers are usually employed in combination with
"prodrugs" or "pro-toxins", compounds which are converted into toxins
by the activity of the selection marker.

5-Methylthioribose (MTR) kinase is an enzyme whose enzymic activity in
plants, bacteria and protozoa, but not in mammals, has been described.
The enzyme may convert an MTR analog (5-(trifluoromethyl)thioribose) as
a "subversive substrate" of the methionine salvage pathway via an
unstable intermediate to give the toxic compound carbothionyl
difluoride.



Said selection systems have various disadvantages. The introduced
selection marker (e.g. resistance to antibiotics) is justified only
during transformation and selection but is later a usually unnecessary
and often also undesired protein product. This may be disadvantageous
for reasons of consumer acceptance and/or approval as a food and/or
feed product. Another disadvantage in this connection is the fact that
the selection marker used for selection is usually genetically coupled
to the nucleic acid sequence to be inserted into the genome and cannot
be decoupled by segregation during propagation or crossing. Usually,
deletion of the marker sequence is required, making additional steps
necessary. In addition, biotechnological studies require in numerous
cases multiple transformation with various gene constructs. Here, each
transformation step requires a new selection marker unless the
previously used marker is to be laboriously deleted first. This,
however, necessitates a broad palette of well-functioning selection
markers which are not available for most plant organisms.






Consequently, it was the object of the invention to provide novel
selection processes for selecting transformed plant cells and
organisms, which, if possible, no longer have the disadvantages of the
available systems. This object is achieved by the present invention.



The invention firstly relates to a process for preparing transformed
plant cells or organisms, which process comprises the following steps:

a) transforming a population of plant cells, with the cells
of said population containing at least one marker protein
capable of causing directly or indirectly a toxic effect
for said population, with at least one nucleic acid
sequence to be inserted in combination with at least one
compound capable of reducing the expression, amount,
activity and/or function of at least one marker protein,
and

b) selecting transformed plant cells whose genome contains
said nucleic acid sequence and which have a growth
advantage over nontransformed cells, due to the action of said
compound, from said population of plant cells, the
selection being carried out under conditions under which the
marker protein can exert its toxic effect on the
nontransformed cells.



In a preferred embodiment, the marker protein is a protein capable of
converting directly or indirectly a substance X which is nontoxic for
said population of plant cells into a substance Y which is toxic for
said population. In this case, the process of the invention preferably
comprises the following steps:

a) transforming the population of plant cells with at least
one nucleic acid sequence to be inserted in combination
with at least one compound capable of reducing the
expression, amount, activity and/or function of at least
one marker protein, and

b) treating said population of plant cells with the
substance X at a concentration which causes a toxic effect
for nontransformed cells, due to the conversion by the
marker protein, and

c) selecting transformed plant cells whose genome contains
said inserted nucleic acid sequence and which have a
growth advantage over nontransformed cells, due to the
action of said compound, from said population of plant
cells, the selection being carried out under conditions
under which the marker protein can exert its toxic effect
on the nontransformed cells.
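
By way of illustration only, steps a) to c) may be sketched as a toy
model. The names used below (Cell, transform, treat_and_select) are
hypothetical and not part of the disclosure. The sketch assumes, as a
simplification, that the compound completely abolishes marker protein
function in every transformed cell and that substance X, once applied,
eliminates every cell in which the marker protein is still active.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    has_insert: bool      # genome contains the nucleic acid sequence to be inserted
    marker_active: bool   # marker protein can still convert X into toxic Y


def transform(n_cells: int, transformed: set[int]) -> list[Cell]:
    """Step a): the insert and the marker-reducing compound arrive together.

    Assumption: every transformed cell loses marker function completely.
    """
    return [
        Cell(has_insert=i in transformed, marker_active=i not in transformed)
        for i in range(n_cells)
    ]


def treat_and_select(population: list[Cell], x_applied: bool) -> list[Cell]:
    """Steps b) and c): apply substance X, keep the cells that survive.

    Assumption: a cell dies only if X is present and its marker is active.
    """
    return [c for c in population if not (x_applied and c.marker_active)]


if __name__ == "__main__":
    population = transform(10, {2, 7})
    survivors = treat_and_select(population, x_applied=True)
    print(len(survivors), all(c.has_insert for c in survivors))
```

In this model, without substance X no selection takes place, which
corresponds to the requirement that selection be carried out under
conditions under which the marker protein can exert its toxic effect.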



The nontoxic substance X is preferably a substance which does not 
naturally occur in plant cells or organisms or occurs naturally 
therein only at a concentration which can essentially not cause 
any toxic effect. In the scope of the process of the invention, 
preference is given to applying the nontoxic substance X exoge- 
nously, for example via the medium or the growth substrate. 

The term "compound capable of reducing the expression, amount,
activity and/or function of at least one marker protein" is to be
understood broadly and generally means any compounds which cause,
directly or indirectly, alone or in cooperation with other factors, a
reduction in the amount of protein, amount of RNA, gene activity,
protein activity or protein function of at least one marker protein.
Said compounds are also referred to under the generic term "anti-marker
protein" compounds. The term "anti-marker protein" compound includes in
particular, but is not limited to, the nucleic acid sequences,
ribonucleic acid sequences, double-stranded ribonucleic acid sequences,
antisense ribonucleic acid sequences, expression cassettes, peptides,
proteins or other factors used in the preferred embodiments within the
scope of the process of the invention.



In a preferred embodiment, "anti-marker protein" compound means a
DNA construct comprising

a) at least one expression cassette suitable for expressing
a ribonucleic acid sequence and/or, if appropriate, a
protein, said nucleic acid sequence and/or protein being
capable of reducing the expression, amount, activity and/
or function of the marker protein, or

b) at least one sequence which causes a partial or complete
deletion or inversion of the sequence coding for said
marker protein and thus enables the expression, amount,
activity and/or function of the marker protein to be
reduced, and also, if appropriate, further functional
elements which facilitate and/or promote said deletion or
inversion, or

c) at least one sequence which causes an insertion into the
sequence coding for said marker protein and thus enables
the expression, amount, activity and/or function of the
marker protein to be reduced, and also, if appropriate,
further functional elements which facilitate and/or
promote said insertion.



The process of the invention stops the negative-selective action of
the marker protein. To this extent, an "anti-marker protein" compound
acts directly (e.g. via inactivation by means of insertion into the
gene coding for the marker protein) or indirectly (e.g. by means of the
ribonucleic acid sequence expressed via the expression cassette and/or,
where appropriate, of the protein translated therefrom) as a positive
selection marker. Hence, the selection system of the invention is to be
referred to as a "reverse selection system", since it "reverts" the
negative-selective action of the marker protein.

The process of the invention means a drastic broadening of the
repertoire of positive selection processes for selecting transformed
plant cells.

Another advantage is the fact that in a particular, preferred
embodiment (e.g. via the action of a double-stranded or antisense RNA),
it is possible to implement the selection effect without expressing a
foreign protein (see below).

It is also advantageous that the marker protein used indirectly 
for selection (e.g. the negative selection marker) is not coupled 
genetically to the nucleic acid sequence to be inserted into the 
genome. In contrast to the otherwise customary selection pro- 
cesses, the marker protein, if it is a transgene, may be removed 
by simple segregation in the course of subsequent propagation or 
crossing. 
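
The segregation argument can be made concrete with a short worked
example. The sketch below is illustrative only and rests on assumptions
not stated in the text: the marker transgene and the inserted sequence
are each present as a single hemizygous copy, the two loci are unlinked,
and the plant is selfed. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def fraction_insert_without_marker() -> Fraction:
    """Share of selfed progeny carrying the insert but no marker transgene.

    Assumes two unlinked, hemizygous, single-copy loci (illustrative only).
    """
    # Each gamete either carries (True) or lacks (False) the insert and the marker.
    gametes = list(product([True, False], repeat=2))
    wanted = 0
    total = 0
    for (insert_a, marker_a), (insert_b, marker_b) in product(gametes, repeat=2):
        total += 1
        has_insert = insert_a or insert_b
        has_marker = marker_a or marker_b
        if has_insert and not has_marker:
            wanted += 1
    return Fraction(wanted, total)


if __name__ == "__main__":
    print(fraction_insert_without_marker())
```

Under these assumptions, 3 of 16 progeny retain the inserted sequence
while having lost the marker transgene; with genetic coupling of marker
and insert, this class would not arise by segregation.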



"Plant cell" means within the scope of the present invention any
type of cell which has been derived from a plant organism or is
present therein. In this context, the term includes by way of example
protoplasts, callus or cell cultures, microspores, pollen, cells in the
form of tissues such as leaves, meristem, flowers, embryos, roots, etc.
Included are, in particular, all of those cells and cell populations
which are suitable as target tissues for a transformation.



In this context, "plant organism" comprises any organism capable of
photosynthesis and also the cells, tissues, parts or propagation
material (such as seeds or fruits) derived therefrom. Included within
the scope of the invention are all genera and species of higher and
lower plants of the plant kingdom. Preference is given to annual,
perennial, monocotyledonous and dicotyledonous plants and also
gymnosperms.



"Plant" means within the scope of the invention all genera and 
species of higher and lower plants of the plant kingdom. The term 
includes the mature plants, seed, shoots and seedlings, and also 
parts, propagation material (for example tubers, seeds or 
fruits), plant organs, tissues, protoplasts, callus and other 
cultures, for example cell cultures, derived therefrom, and also 
any other types of groupings of plant cells to give functional or 
structural units. Mature plants means plants at any developmental
stage beyond that of the seedling. Seedling means a young immature
plant at an early developmental stage. "Plant" comprises all annual and
perennial monocotyledonous and dicotyledonous plants and includes by
way of example but not by limitation those of the genera Cucurbita,
Rosa, Vitis, Juglans, Fragaria, Lotus, Medicago, Onobrychis, Trifolium,
Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus,
Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura,
Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis,
Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus,
Antirrhinum, Heterocallis, Nemesis, Pelargonium, Panicum, Pennisetum,
Ranunculus, Senecio, Salpiglossis, Cucumis, Browaalia, Glycine, Pisum,
Phaseolus, Lolium, Oryza, Zea, Avena, Hordeum, Secale, Triticum,
Sorghum, Picea and Populus.



Preference is given to plants of the following plant families:
Amaranthaceae, Asteraceae, Brassicaceae, Caryophyllaceae,
Chenopodiaceae, Compositae, Cruciferae, Cucurbitaceae, Labiatae,
Leguminosae, Papilionoideae, Liliaceae, Linaceae, Malvaceae, Rosaceae,
Rubiaceae, Saxifragaceae, Scrophulariaceae, Solanaceae, Sterculiaceae,
Tetragoniaceae, Theaceae, Umbelliferae.






Preferred monocotyledonous plants are selected in particular from
the monocotyledonous crop plants such as, for example, those in
the family of Gramineae such as alfalfa, rice, corn, wheat or
other cereal species such as barley, millet, rye, triticale or
oats and also from sugar cane and all grass species.

Preferred dicotyledonous plants are selected in particular from 
the dicotyledonous crop plants such as, for example, 

- Asteraceae, such as sunflower, tagetes or calendula and others,

- Compositae, in particular the genus Lactuca, very especially
the species sativa (lettuce) and others,

- Cruciferae, especially the genus Brassica, very especially the
species napus (oilseed rape), campestris (beet), oleracea cv
Tastie (cabbage), oleracea cv Snowball Y (cauliflower) and
oleracea cv Emperor (broccoli) and other cabbage species; and the
genus Arabidopsis, very especially the species thaliana, and
cress or canola and others,

- Cucurbitaceae, such as melon, pumpkin/squash or zucchini and
others,

- Leguminosae, especially the genus Glycine, very especially the
species max (soybean) and alfalfa, pea, bean plant or peanut,
and others,






- Rubiaceae, preferably the subclass Lamiidae, such as, for
example, Coffea arabica or Coffea liberica (coffee bush) and
others,

- Solanaceae, in particular the genus Lycopersicon, very
especially the species esculentum (tomato), the genus Solanum, very
especially the species tuberosum (potato) and melongena
(eggplant), and the genus Capsicum, very especially the species
annuum (pepper) and tobacco and others,

- Sterculiaceae, preferably the subclass Dilleniidae, such as,
for example, Theobroma cacao (cacao tree) and others,

- Theaceae, preferably the subclass Dilleniidae, such as, for
example, Camellia sinensis or Thea sinensis (tea shrub) and
others,

- Umbelliferae, especially the genus Daucus (very especially the
species carota (carrot)) and Apium (very especially the species
graveolens dulce (celery)) and others,

and linseed, cotton, hemp, flax, cucumber, spinach, carrot, sugar
beet and the various tree, nut and grapevine species, in particular
banana and kiwi.






Plant organisms for the purposes of the invention are furthermore
other photosynthetically active organisms such as, for example, algae,
cyanobacteria and mosses. Preferred algae are green algae such as, for
example, algae of the genus Haematococcus, Phaeodactylum tricornutum,
Volvox or Dunaliella. Particular preference is given to Synechocystis.


Particular preference is given to the group of plants consisting
of wheat, oats, millet, barley, rye, corn, rice, buckwheat, sorghum,
triticale, spelt, linseed, sugar cane, oilseed rape, cress,
Arabidopsis, cabbage species, soybean, alfalfa, pea, bean plants,
peanut, potato, tobacco, tomato, eggplant, paprika, sunflower,
tagetes, lettuce, calendula, melon, pumpkin and zucchini.






Most preference is given to

a) plants suitable for producing oil, such as, for example,
oilseed rape, sunflower, sesame, safflower (Carthamus
tinctorius), olive tree, soybean, corn, peanut, ricinus, oil palm,
wheat, cacao tree or various nut species such as, for
example, walnut, coconut or almond. Among these, particular
preference is in turn given to dicotyledonous plants, in
particular oilseed rape, soybean and sunflower.

b) plants suitable for producing starch, such as corn, wheat or
potato, for example.

c) plants which are utilized as food and/or feedstuff and/or as
useful plants and in which a resistance to pathogens would be
advantageous, such as barley, rye, rice, potato, cotton, flax
or linseed, for example.

d) plants which may be suitable for producing fine chemicals
such as, for example, vitamins and/or carotenoids, such as
oilseed rape, for example.



"Population of plant cells" means any group of plant cells, which 
may be subjected within the scope of the present invention to a
transformation and from which transgenic plant cells transformed
by the process of the invention may be obtained and isolated. In
this context, said population may also be, for example, a plant
tissue, organ or a cell culture, etc. Said population may comprise
by way of example but not by limitation an isolated zygote,
an isolated immature embryo, embryogenic callus, plant or else
various flower tissues (both in vitro and in vivo).

"Genome" means the entirety of genetic information of a plant 
cell and comprises both genetic information of the nucleus and
that of the plastids (e.g. chloroplasts) and mitochondria. However,
genome preferably means the genetic information of the
nucleus (for example of the nuclear chromosomes).

"Selection" means identifying and/or isolating successfully 
transformed plant cells from a population of nontransformed cells
by using the process of the invention. This does not necessarily
require that the selection be carried out directly with the
transformed cells immediately after transformation. It is also
possible to carry out the selection only at a later time, even
with a later generation of the plant organisms (or cells, tissues,
organs or propagation material derived therefrom) resulting
from the transformation. Thus it is possible, for example, to
transform Arabidopsis plants directly using, for example, the
vacuum infiltration method (Clough S & Bent A (1998) Plant J
16(6):735-43; Bechtold N et al. (1993) CR Acad Sci Paris
1144(2):204-212), which subsequently produce transgenic seeds
which may then be subjected to selection.


The fact that the nucleic acid sequence to be inserted is transformed
"in combination with" the "anti-marker protein" compound (e.g. a DNA
construct) is to be understood broadly and means that at least one
nucleic acid sequence to be inserted and at least one "anti-marker
protein" compound are functionally coupled to one another so that the
presence of the "anti-marker protein" compound in the plant cell, and
of the selection advantage related thereto, indicates that the inserted
nucleic acid sequence is likely to be present as well. The nucleic acid
sequence to be inserted and the "anti-marker protein" compound (e.g. a
DNA construct) here may be, preferably but not necessarily, part of a
single nucleic acid construct (e.g. a transformation construct or
transformation vector), i.e. be present physicochemically coupled via a
covalent bond. However, they may also be introduced jointly but
separately, for example in the course of a cotransformation, and exert
their function within the scope of the process of the invention also in
this way. In the case of the "anti-marker protein" compound acting via
expressing an RNA (e.g. an antisense RNA or double-stranded RNA) or
being such an RNA, "in combination" may also include those embodiments
in which said RNA and the RNA expressed by the nucleic acid sequence
inserted into the genome form an RNA strand.

"Nontoxic substance X" generally means substances which, compared
to their reaction product Y, under otherwise identical conditions,
have a reduced, preferably an essentially lacking, biological
activity, preferably toxicity. In this context, the toxicity
of substance Y is at least twice as high as that of substance X,
preferably at least five times as high, particularly preferably
at least ten times as high, very particularly preferably at least
twenty times as high, most preferably at least one hundred times
as high. "Identical conditions" here means that all conditions
are kept the same, apart from the different substances X and Y.
Accordingly, identical molar concentrations of X and Y are used,
with the medium, temperature, type of organism and density of
organism, etc. being the same. The substance X may be converted to
the substance Y in various ways, for example by hydrolysis,
deamination, dephosphorylation, phosphorylation,
oxidation or any other type of activation, metabolization or
conversion. The substance X may be, by way of example but not by
limitation, the inactive precursor or derivative of a plant
growth regulator or herbicide.
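
The graded toxicity ratios stated above can be expressed as a simple
lookup. The following sketch is illustrative only: the function name
and the tier labels are hypothetical, and it assumes that the toxicity
of X and of Y has been quantified on a common non-negative scale under
identical conditions (how toxicity is measured is not specified here).

```python
import math


def preference_tier(toxicity_x: float, toxicity_y: float) -> str:
    """Classify a substance pair by the ratio toxicity(Y) / toxicity(X)."""
    if toxicity_x < 0 or toxicity_y < 0:
        raise ValueError("toxicity values must be non-negative")
    if toxicity_x == 0:
        # X essentially lacks toxicity: any toxic Y gives an unbounded ratio.
        ratio = math.inf if toxicity_y > 0 else 0.0
    else:
        ratio = toxicity_y / toxicity_x

    # Thresholds taken from the definition of "nontoxic substance X".
    tiers = [
        (100, "most preferred"),
        (20, "very particularly preferred"),
        (10, "particularly preferred"),
        (5, "preferred"),
        (2, "qualifies"),
    ]
    for threshold, label in tiers:
        if ratio >= threshold:
            return label
    return "does not qualify"


if __name__ == "__main__":
    print(preference_tier(1.0, 12.0))
```

A pair in which Y is twelve times as toxic as X thus falls into the
"particularly preferred" range, while a ratio below two does not meet
the definition.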






"Toxicity" or "toxic effect" means a measurable, negative influ- 
ence on the physiology of the plant or of the plant cell and may 
comprise here symptoms such as, for example, but not limited 
thereto, a reduced or disrupted growth, a reduced or disrupted 
rate of photosynthesis, a reduced or disrupted cell division, a 
reduced or disrupted regeneration of a complete plant from cell 
culture or callus, etc. 
The plant cells successfully transformed by means of the process
of the invention may, to put it differently, have a growth advantage
or selection advantage over the nontransformed cells of the
same starting population under the influence of the substance
"X". Growth or selection advantage is to be understood here
broadly and means, for example, the fact that said transformed
plant cells are capable of forming shoots and/or can be regenerated
to give complete plants, whereas the nontransformed cells
can do this only with a marked delay, if at all.


The term "marker protein" is to be understood broadly and generally
means all of those proteins which are capable of

i) exerting per se a toxic effect on the plant or plant cell, or

ii) converting directly or indirectly a nontoxic substance X into
a substance Y which is toxic for the plant or plant cell.


In this context, the marker protein may be encoded by a
plant-intrinsic, endogenous gene or else by a transgene from a
different organism. Preferably, the marker protein itself has no
essential function for the organism including the marker protein. If
the marker protein per se exerts a toxic effect, then it will
preferably be expressed, for example, under an inducible promoter
rather than constitutively.

Preferably, however, the marker protein converts directly or
indirectly a nontoxic substance X into a substance Y which is toxic
for the plant or plant cell. Particularly preferred marker proteins
are the "negative selection markers" as are used, for example, in the
course of targeted deletions from the genome.

Examples of marker proteins which may be mentioned but which are
not limiting are:



(a) cytosine deaminases (CodA or CDase), with preference being
given to using as the nontoxic substance X substances such as
5-fluorocytosine (5-FC). Cytosine deaminases catalyze the
deamination of cytosine to give uracil (Kilstrup M et al.
(1989) J Bacteriol 171:2124-2127; Andersen L et al. (1989)
Arch Microbiol 152:115-118). Bacteria and fungi which have
CDase activity convert 5-FC to the toxic metabolite ("Y")
5-fluorouracil (5-FU) (Polak A & Scholer HJ (1975)
Chemotherapy (Basel) 21:113-130). 5-FC itself has low toxicity
(Bennett JE, in Goodman and Gilman: the Pharmacological Basis
of Therapeutics. 8th ed., eds. Gilman AG et al. (Pergamon
Press, New York) pp. 1165-1181). However, 5-FU has a highly
cytotoxic effect, since it is subsequently metabolized to
fluoro-UTP (FUTP) and fluoro-dUMP (FdUMP) and thus inhibits
RNA and DNA synthesis (Calabresi P & Chabner BA in Goodman
and Gilman: the Pharmacological Basis of Therapeutics. 8th
ed., eds. Gilman AG et al. (Pergamon Press, New York) pp.
1209-1263; Damon LE et al. (1989) Pharmac Ther 43:155-189).

Cells of higher plants and mammalian cells have no significant
CDase activity and cannot deaminate 5-FC (Polak A et al.
(1976) Chemotherapy 22:137-153; Koechlin BA et al. (1966)
Biochemical Pharmacology 15:434-446). In this respect, the
CDase is introduced as a transgene (e.g. in the form of a
transgenic expression cassette) into plant organisms in the
course of the process of the invention. Corresponding
transgenic plant cells or organisms are then used as master plants
serving as starting material. Appropriate CDase sequences, transgenic
plant organisms and the process of carrying out negative
selection processes using, for example, 5-FC as nontoxic
substance X, are known to the skilled worker (WO 93/01281; US
5,358,866; Gleave AP et al. (1999) Plant Mol Biol
40(2):223-35; Perera RJ et al. (1993) Plant Mol Biol
23(4):793-799; Stougaard J (1993) Plant J 3:755-761; EP-A1
595 873; Mullen CA et al. (1992) Proc Natl Acad Sci USA
89(1):33-37; Kobayashi T et al. (1995) Jpn J Genet
70(3):409-422; Schlaman HRM & Hooykaas PJJ (1997) Plant J
11:1377-1385; Xiaohui Wang H et al. (2001) Gene 272(1-2):
249-255; Koprek T et al. (1999) Plant J 19(6):719-726; Gleave
AP et al. (1999) Plant Mol Biol 40(2):223-235; Gallego ME
(1999) Plant Mol Biol 39(1):83-93; Salomon S & Puchta H
(1998) EMBO J 17(20):6086-6095; Thykjaer T et al. (1997)
Plant Mol Biol 35(4):523-530; Serino G (1997) Plant J
12(3):697-701; Risseeuw E (1997) Plant J 11(4):717-728; Blanc
V et al. (1996) Biochimie 78(6):511-517; Corneille S et al.
(2001) Plant J 27:171-178). Cytosine deaminases and the genes
coding therefor may be obtained from a multiplicity of
organisms, preferably microorganisms such as, for example, the
fungi Cryptococcus neoformans, Candida albicans, Torulopsis
glabrata, Sporothrix schenckii, Aspergillus, Cladosporium
and Phialophora (JE Bennett, Chapter 50: Antifungal Agents,
in Goodman and Gilman's the Pharmacological Basis of
Therapeutics 8th ed., A.G. Gilman, ed., Pergamon Press, New York,
1990) and the bacteria E. coli and Salmonella typhimurium
(Andersen L et al. (1989) Arch Microbiol 152:115-118).

The sequences, materials and processes disclosed in the context
of said publications are hereby explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: S56903, and to the modified codA sequences
described in EP-A1 595 873, which make expression in eukaryotes
possible. Preference is given here to nucleic acid sequences
coding for polypeptides according to SEQ ID NO: 2 or,
preferably, 4, in particular the sequences according to SEQ ID NO:
1 or, preferably, 3.

(b) cytochrome P-450 enzymes, in particular the bacterial
cytochrome P-450 SU1 gene product (CYP105A1) from Streptomyces
griseolus (strain ATCC 11796), with preference being given to
using as nontoxic substance X substances such as the
pro-sulfonylurea herbicide R7402 (2-methylethyl-2,3-dihydro-
N-[(4,6-dimethoxypyrimidin-2-yl)aminocarbonyl]-1,2-benzoisothiazole-
7-sulfonamide 1,1-dioxide). Corresponding sequences
and the process of carrying out negative selection processes
using, for example, R7402 as nontoxic substance X are known
to the skilled worker (O'Keefe DP et al. (1994) Plant Physiol
105:473-482; Tissier AF et al. (1999) Plant Cell
11:1841-1852; Koprek T et al. (1999) Plant J 19(6):719-726;
O'Keefe DP (1991) Biochemistry 30(2):447-55). The sequences,
materials and processes disclosed in the context of said
publications are hereby explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: M32238. Preference is further given to nucleic
acid sequences coding for the polypeptide according to SEQ ID
NO: 6, in particular the sequence according to SEQ ID NO: 5.

(c) indoleacetic acid hydrolases such as, for example, the tms2
gene product of Agrobacterium tumefaciens, with preference being
given to using as nontoxic substance X substances such as
auxin amide compounds or naphthaleneacetamide (NAM) (with NAM
being converted to naphthaleneacetic acid, a phytotoxic
substance). Corresponding sequences and the process of carrying
out negative selection processes using, for example, NAM as
nontoxic substance X are known to the skilled worker
(Fedoroff NV & Smith DL (1993) Plant J 3:273-289; Upadhyaya
NM et al. (2000) Plant Mol Biol Rep 18:227-223; Depicker AG
et al. (1988) Plant Cell Rep 104:1067-1071; Karlin-Neumann
GA et al. (1991) Plant Cell 3:573-582; Sundaresan V et al.
(1995) Gene Develop 9:1797-1810; Cecchini E et al. (1998)
Mutat Res 401(1-2):199-206; Zubko E et al. (2000) Nat
Biotechnol 18:442-445). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: NC_003308 (Protein_id="NP_536128.1"), AE009419,
AB016260 (Protein_id="BAA87807.1") and NC002147. Preference is
further given to nucleic acid sequences coding for
polypeptides according to SEQ ID NO: 8 or 10, in particular the
sequences according to SEQ ID NO: 7 or 9.

(d) haloalkane dehalogenases (dhlA gene product), for example
from Xanthobacter autotrophicus GJ10. The dehalogenase
hydrolyzes dihaloalkanes such as 1,2-dichloroethane (DCE) to give
halogenated alcohols and inorganic halides (Naested H et al.
(1999) Plant J 18(5):571-576; Janssen DB et al. (1994) Annu
Rev Microbiol 48:163-191; Janssen DB (1989) J Bacteriol
171(12):6791-9). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: M26950. Preference is further given to nucleic
acid sequences coding for the polypeptide according to SEQ ID
NO: 12, in particular the sequence according to SEQ ID NO:
11.

(e) thymidine kinases (TK), in particular viral TKs from viruses
such as Herpes simplex virus, SV40, cytomegalovirus and
Varicella zoster virus, in particular the TK of Herpes
simplex virus type 1 (TK HSV-1), with preference being given to
using as nontoxic substance X substances such as Acyclovir,
Ganciclovir or 1,2-deoxy-2-fluoro-β-D-arabinofuranosyl-5-iodouracil
(FIAU). Corresponding sequences and the process of
carrying out negative selection processes using, for example,
Acyclovir, Ganciclovir or FIAU as nontoxic substance X are
known to the skilled worker (Czako M & Marton L (1994) Plant
Physiol 104:1067-1071; Wigler M et al. (1977) Cell
11(1):223-232; McKnight SL et al. (1980) Nucl Acids Res
8(24):5949-5964; McKnight SL et al. (1980) Nucl Acids Res
8(24):5931-5948; Preston et al. (1981) J Virol 38(2):593-605;
Wagner et al. (1981) Proc Natl Acad Sci USA 78(3):1441-1445;
St. Clair et al. (1987) Antimicrob Agents Chemother
31(6):844-849). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: J02224, V00470 and V00467. Preference is also
given to nucleic acid sequences coding for polypeptides
according to SEQ ID NO: 14 or 16, in particular the sequences
according to SEQ ID NO: 13 or 15.

(f) guanine phosphoribosyl transferases, hypoxanthine
phosphoribosyl transferases or xanthine guanine phosphoribosyl
transferases, with preference being given to using as nontoxic
substance X substances such as 6-thioxanthine or allopurinol.
Preference is given to guanine phosphoribosyl transferases
(gpt), for example from E. coli (Besnard et al. (1987) Mol
Cell Biol 7:4139; Mroz and Moolten (1993) Human Gene Therapy
4:589-595; Ono et al. (1997) Hum Gene Ther 8(17):2043-55),
hypoxanthine phosphoribosyl transferases (HPRT; Jolly et al.
(1983) Proc Natl Acad Sci USA 80:477; Fenwick "The HGPRT
System", pp. 333-373, M. Gottesman (ed.), Molecular Cell
Genetics, John Wiley and Sons, New York, 1985), xanthine guanine
phosphoribosyl transferases, for example from Toxoplasma
gondii (Knoll LJ et al. (1998) Mol Cell Biol 18(2):807-814;
Donald RG et al. (1996) J Biol Chem 271(24):14010-14019). The
sequences, materials and processes disclosed in the context
of said publications are hereby explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: U10247 (Toxoplasma gondii HXGPRT), M13422
(E. coli gpt) and X00221 (E. coli gpt). Preference is also
given to nucleic acid sequences coding for polypeptides
according to SEQ ID NO: 18, 20 or 22, in particular the
sequences according to SEQ ID NO: 17, 19 or 21.

(g) purine nucleoside phosphorylases (PNP; DeoD gene product),
for example from E. coli, with preference being given to
using as nontoxic substance X substances such as 6-methylpurine
deoxyribonucleoside. Corresponding sequences and the process
of carrying out negative selection processes using, for
example, 6-methylpurine deoxyribonucleoside as nontoxic substance
X are known to the skilled worker (Sorscher EJ et al. (1994)
Gene Therapy 1:233-238). The sequences, materials and
processes disclosed in the context of said publications are
hereby explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: M60917. Preference is also given to nucleic
acid sequences coding for the polypeptide according to SEQ ID
NO: 24, in particular the sequence according to SEQ ID NO:
23.

(h) phosphonate monoester hydrolases which convert inactive ester
derivatives of the herbicide glyphosate (e.g.
glycerylglyphosate) into the active form of the herbicide. Corresponding
sequences and the process of carrying out negative selection
processes using, for example, glycerylglyphosate are known to
the skilled worker (US 5,254,801; Dotson SB et al. (1996)
Plant J 10(2):383-392; Dotson SB et al. (1996) J Biol Chem
271(42):25754-25761). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: U44852. Preference is also given to nucleic
acid sequences coding for the polypeptide according to SEQ ID
NO: 26, in particular the sequence according to SEQ ID NO:
25.

(i) aux-1 and, preferably, aux-2 gene products, for example of
the Ti plasmids of Agrobacterium strains such as A. rhizogenes
or A. tumefaciens (Beclin C et al. (1993) Transgenics Res
2:48-55; Gaudin V & Jouanin L (1995) Plant Mol Biol
28(1):123-36).

The activity of the two enzymes causes the plant cell to
produce indoleacetic acid (IAA). Aux-1 encodes an indoleacetamide
synthase (IAMS) and converts tryptophan into indoleacetamide
(VanOnckelen et al. (1986) FEBS Lett. 198:357-360). Aux-2
encodes the enzyme indoleacetamide hydrolase (IAMH) and
converts indoleacetamide, a substance without phytohormone
activity, into the active auxin indoleacetic acid (Inze D et
al. (1984) Mol Gen Genet 194:265-274; Thomashow et al. (1984)
Proc Natl Acad Sci USA 81:5071-5075; Schroder et al. (1984)
Eur J Biochem 138:387-391). The enzyme IAMH may also
hydrolyze a number of indoleamide substrates such as, for example,
naphthaleneacetamide, the latter being converted into the
plant growth regulator naphthaleneacetic acid (NAA). The use
of the IAMH gene as a negative selection marker is described,
for example, in US 5,180,873. Corresponding enzymes have also
been described in A. rhizogenes, A. vitis (Canaday J et al.
(1992) Mol Gen Genet 235:292-303) and Pseudomonas savastanoi
(Yamada et al. (1985) Proc Natl Acad Sci USA 82:6522-6526).
The use as a negative selection marker for destroying
particular cell tissues (e.g. pollen; US 5,426,041) or transgenic
plants (US 5,180,873) has been described. Corresponding
sequences and the process of carrying out negative selection
processes using, for example, naphthaleneacetamide are known
to the skilled worker (see above). The sequences, materials
and processes disclosed in the context of said publications
are hereby explicitly referred to.



Particular preference is given to sequences according to the 
GenBank Acc. No: M61151, AF039169 and AB025110. Preference is 
also given to nucleic acid sequences coding for polypeptides 
according to SEQ ID NO: 28, 30, 32, 34 or 36, in particular 
the sequences according to SEQ ID NO: 27, 29, 31, 33 or 35. 



(j) adenine phosphoribosyl transferases (APRT), with preference
being given to using as nontoxic substance X substances such
as 4-aminopyrazolopyrimidine. Corresponding sequences and the
process of carrying out negative selection processes using it
are known to the skilled worker (Wigler M et al. (1979) Proc
Natl Acad Sci USA 76(3):1373-6; Taylor et al. "The APRT
System", pp. 311-332, M. Gottesman (ed.), Molecular Cell
Genetics, John Wiley and Sons, New York, 1985).

(k) methoxinine dehydrogenases, with preference being given to
using as nontoxic substance X substances such as
2-amino-4-methoxybutanoic acid (methoxinine) which is converted
into the toxic methoxyvinyl glycine (Margraff R et al. (1980)
Experientia 36:846),

(l) rhizobitoxin synthases, with preference being given to using
as nontoxic substance X substances such as
2-amino-4-methoxybutanoic acid (methoxinine) which is converted into the toxic
2-amino-4-[2-amino-3-hydroxypropyl]-trans-3-butanoic acid
(rhizobitoxin) (Owens LD et al. (1973) Weed Science
21:63-66),

(m) 5-methylthioribose (MTR) kinases, with preference being given
to using as nontoxic substance X substances such as
5-(trifluoromethyl)thioribose (MTR analog, "subversive substrate")
which is converted, via an unstable intermediate, into the
toxic substance (Y) carbothionyl difluoride. The MTR kinase
is a key enzyme of the methionine salvage pathway.
Corresponding enzyme activities have been described in plants,
bacteria and protozoa but not in mammals. MTR kinases of
various species have been identified owing to defined sequence
motifs (Sekowska A et al. (2001) BMC Microbiol 1:15;
http://www.biomedcentral.com/1471-2180/1/15). Corresponding
sequences and the process of carrying out negative selection
processes using, for example, 5-(trifluoromethyl)thioribose
are known to the skilled worker and readily obtainable from
the appropriate sequence database (e.g. GenBank) (Sekowska A
et al. (2001) BMC Microbiol 1:15; Cornell KA et al. (1996)
317:285-290). The sequences, materials and processes
disclosed in the context of said publications are hereby
explicitly referred to.

However, a plant MTR kinase has not yet been identified
unambiguously and is provided within the scope of the process of
the invention (SEQ ID NO: 39 and, respectively, 40). In
addition, homologs from other plant species are provided, namely
from corn (SEQ ID NO: 59 and, respectively, 60), oilseed rape
(SEQ ID NO: 61, 63 and, respectively, 62, 64), rice (SEQ ID
NO: 65 and, respectively, 66) and soybean (SEQ ID NO: 67 and,
respectively, 68).

Accordingly, the invention further relates to amino acid
sequences encoding a plant 5-methylthioribose kinase, wherein
said amino acid sequence contains at least one sequence
selected from the group consisting of SEQ ID NO: 60, 62, 64, 66
or 68.

Accordingly, the invention further relates to nucleic acid
sequences encoding a plant 5-methylthioribose kinase, wherein
said nucleic acid sequence contains at least one sequence
selected from the group consisting of SEQ ID NO: 59, 61, 63, 65
or 67. Even if said sequences are in parts only fragments of
complete cDNAs, their length is nevertheless more than
sufficient to ensure use and functionality as antisense
RNA or double-stranded RNA. Preference is given to using as
marker protein a plant endogenous MTR kinase. Further
endogenous plant MTR kinases may readily be identified by
screening databases or gene libraries using conserved, MTR
kinase-typical motifs. Said motifs may be derived from Fig.
9a-b, for example. Such motifs may comprise, by way of
example but not by limitation, the following sequences:

E(V/I)GDGN(L/I)N(L/Y/F)V(F/Y), preferably EVGDGNLN(Y/F)V(F/Y)

KQALPY(V/I)RC

SWPMT(R/K)ERAYF

PEVYHFDRT

GMRY(I/L)EPPHI

CRLTEQWFSDPY

HGDLH(S/T)GS

Further suitable motifs may be derived from Fig. 9a-b without
difficulty.
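
The degenerate motifs listed above can be converted into regular expressions for screening protein sequences. The following is a minimal sketch only, not a method taken from the specification: the motif strings are a cleaned transcription of the list above (assuming that lower-case letters and stray spacing there are extraction artifacts), and the helper names `motif_to_regex` and `find_motifs` as well as the toy sequence are hypothetical.

```python
import re

# Motifs in the notation used above: (A/B) means "A or B" at one position.
# Assumption: cleaned transcription of the motif list in the text.
MOTIFS = [
    "E(V/I)GDGN(L/I)N(L/Y/F)V(F/Y)",
    "KQALPY(V/I)RC",
    "SWPMT(R/K)ERAYF",
    "PEVYHFDRT",
    "GMRY(I/L)EPPHI",
    "CRLTEQWFSDPY",
    "HGDLH(S/T)GS",
]


def motif_to_regex(motif):
    """Convert '(A/B)' alternatives into a regex character class '[AB]'."""
    pattern = re.sub(
        r"\(([A-Z](?:/[A-Z])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    return re.compile(pattern)


def find_motifs(protein):
    """Return {motif: [start positions]} for every motif found in the sequence."""
    hits = {}
    for motif in MOTIFS:
        positions = [m.start() for m in motif_to_regex(motif).finditer(protein)]
        if positions:
            hits[motif] = positions
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence containing two of the motifs.
    toy = "MAAEIGDGNLNFVYTTTHGDLHTGSKK"
    print(find_motifs(toy))
```

A real screen would run such patterns over translated database entries and would normally tolerate mismatches; exact matching is used here only to keep the sketch short.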

Particular preference is given to sequences according to
GenBank Acc. No: AF212863 or AC079674 (Protein_ID=AAG51775.1).
Preference is also given to nucleic acid sequences coding for
polypeptides according to SEQ ID NO: 38 or 40, in particular
the sequences according to SEQ ID NO: 37 or 39.

(n) alcohol dehydrogenases (Adh), in particular plant Adh-1 gene
products, with preference being given to using as nontoxic
substance X substances such as allyl alcohol which is
converted in this manner into the toxic substance (Y) acrolein.
Corresponding sequences and the process of carrying out
negative selection processes using, for example, allyl alcohol
are known to the skilled worker and readily obtainable from
the appropriate sequence database (e.g. GenBank) (Wisman E et
al. (1991) Mol Gen Genet 226(1-2):120-8; Jacobs M et al.
(1988) Biochem Genet 26(1-2):105-22; Schwartz D (1981)
Environ Health Perspect 37:75-7). The sequences, materials and
processes disclosed in the context of said publications are
hereby explicitly referred to.

Particular preference is given to sequences according to
GenBank Acc. No: X77943, M12196, AF172282, X04049 or AF253472.
Preference is also given to nucleic acid sequences coding for
polypeptides according to SEQ ID NO: 42, 44, 46 or 48, in
particular the sequences according to SEQ ID NO: 41, 43, 45
or 47.



(o) Further suitable negative selection markers are those
sequences which exert per se a toxic action on plant cells,
such as, for example, diphtheria toxin A, ribonucleases such
as barnase and also ribosome-inhibiting proteins such as
ricin. In this context, these proteins are preferably expressed
in the plant cells inducibly rather than constitutively. The
induction is preferably carried out chemically, it being
possible, for example, to use the chemically inducible promoters
mentioned below in order to ensure said chemically induced
expression.

"Reduction" or "to reduce" is to be interpreted broadly in
connection with a marker protein or with its amount, expression,
activity and/or function and comprises the partial or essentially
complete stopping or blocking, based on different cell-biological
mechanisms, of the functionality of a marker protein in a plant
cell, plant or a part, tissue, organ, cells or seeds derived
therefrom.



A reduction for the purpose of the invention also comprises a
reduction of the amount of a marker protein down to an essentially
complete lack of said marker protein (i.e. a lack of
detectability of marker protein activity or marker protein function or a
lack of immunological detectability of said marker protein). In
this context, a particular marker protein (or its amount,
expression, activity and/or function) in a cell or an
organism is reduced preferably by more than 50%, particularly
preferably by more than 80%, very particularly preferably by more
than 90%, most preferably by more than 98%. Reduction means in
particular also the complete lack of the marker protein (or of
its amount, expression, activity and/or function). In this
context, activity and/or function mean preferably the property of
the marker protein of exerting a toxic effect on the plant cell
or the plant organism and, respectively, the ability to convert
the substance X into the substance Y. The toxic effect caused by
the marker protein is reduced preferably by more than 50%,
particularly preferably by more than 80%, very particularly
preferably by more than 90%, most preferably by more than 98%.
"Reduction" includes of course within the scope of the present
invention also a complete, 100% reduction or removal of the
marker protein (or of its amount, expression, activity and/or
function) (for example by deleting the marker protein gene from the
genome).
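
The graded thresholds above (more than 50%, 80%, 90% and 98%, up to a complete, 100% reduction) can be applied to any measured quantity such as marker protein amount or activity. The following is a minimal sketch under stated assumptions: the function names `percent_reduction` and `preference_tier` and the example values are hypothetical, and the specification does not prescribe a particular assay or calculation.

```python
def percent_reduction(reference, treated):
    """Percent reduction of a measured quantity relative to the reference."""
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * (reference - treated) / reference


def preference_tier(reduction_pct):
    """Map a percent reduction onto the graded preference levels in the text."""
    if reduction_pct >= 100.0:
        return "complete"
    if reduction_pct > 98.0:
        return "most preferred (>98%)"
    if reduction_pct > 90.0:
        return "very particularly preferred (>90%)"
    if reduction_pct > 80.0:
        return "particularly preferred (>80%)"
    if reduction_pct > 50.0:
        return "preferred (>50%)"
    return "below the preferred range"


if __name__ == "__main__":
    # Hypothetical activity readings: untransformed reference vs. transformed cell.
    reduction = percent_reduction(200.0, 30.0)
    print(reduction, preference_tier(reduction))
```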

The invention comprises various strategies for reducing the ex- 
pression, amount, activity and/or function of the marker protein. 
The skilled worker appreciates the fact that a number of various 
methods are available in order to influence the expression, 
amount, activity and/or function of a marker protein in the de- 
sired way. Examples which may be mentioned but which are not lim- 
iting are: 



a) introducing at least one marker protein double-stranded
ribonucleic acid sequence (MP-dsRNA) or an expression cassette or
expression cassettes ensuring expression thereof. Included
are those processes in which the MP-dsRNA is directed against
a marker protein gene (i.e. genomic DNA sequences such as
promoter sequences) or a marker protein gene transcript (i.e.
mRNA sequences).



b) introducing at least one marker protein antisense ribonucleic
acid sequence (MP-antisenseRNA) or an expression cassette
ensuring expression thereof. Included are those processes in
which the MP-antisenseRNA is directed against a marker
protein gene (i.e. genomic DNA sequences) or a marker protein
gene transcript (i.e. RNA sequences). α-Anomeric nucleic acid
sequences are also included.

c) introducing at least one MP-antisenseRNA combined with a
ribozyme or an expression cassette ensuring expression thereof



d) introducing at least one marker protein sense ribonucleic
acid sequence (MP-senseRNA) for inducing a cosuppression or
an expression cassette ensuring expression thereof

e) introducing at least one DNA- or protein-binding factor
against a marker protein gene, marker protein RNA or marker
protein or an expression cassette ensuring expression thereof



f) introducing at least one viral nucleic acid sequence causing 
degradation of the marker protein RNA or an expression cas- 
sette ensuring expression thereof 

g) introducing at least one construct for generating a
functional loss (e.g. generation of stop codons, shifts in the
reading frame etc.) on a marker protein gene, for example by
generating an insertion, deletion, inversion or mutation in a
marker protein gene. Preferably, knockout mutants may be
generated by means of targeted insertion into said marker
protein gene via homologous recombination or by introducing
sequence-specific nucleases against marker protein gene
sequences.



It is known to the skilled worker that it is also possible to use 
other processes within the scope of the present invention in or- 
der to reduce a marker protein or its activity or function. For 
example, it may also be advantageous, depending on the type of 
the marker protein used, to introduce a dominant-negative variant 
of a marker protein or an expression cassette ensuring expression 
thereof. In this context, any single one of these processes may 
cause a reduction in the expression, amount, activity and/or 
function of a marker protein. A combined application is also
conceivable. Further methods are known to the skilled worker and may
comprise hindering or stopping the processing of the marker
protein, the transport of the marker protein or of its mRNA, the
inhibition of ribosome attachment, the inhibition of RNA splicing,
the induction of an enzyme degrading marker protein RNA and/or
the inhibition of translational elongation or termination.

The embodiments below describe by way of example the
individual preferred processes:

a) Introducing a double-stranded ribonucleic acid sequence of a 
marker protein (MP-dsRNA) 

The process of gene regulation by means of double-stranded RNA
("double-stranded RNA interference"; dsRNAi) has been described
many times for animal and plant organisms (e.g. Matzke MA et al.
(2000) Plant Mol Biol 43:401-415; Fire A et al. (1998) Nature
391:806-811; WO 99/32619; WO 99/53050; WO 00/68374; WO 00/44914;
WO 00/44895; WO 00/49035; WO 00/63364). The processes and methods
described in the references indicated are hereby explicitly
referred to. dsRNAi processes are based on the phenomenon that
simultaneously introducing the complementary strand and
counterstrand of a gene transcript suppresses expression of the
corresponding gene in a highly efficient manner. Preferably, the
phenotype caused is very similar to that of a corresponding knockout
mutant (Waterhouse PM et al. (1998) Proc Natl Acad Sci USA
95:13959-64). The dsRNAi process has proved to be particularly
efficient and advantageous in reducing marker protein expression.

Double-stranded RNA molecule means within the scope of the
invention preferably one or more ribonucleic acid sequences which,
owing to complementary sequences, are theoretically (e.g. according
to the base pair rules by Watson and Crick) and/or actually (e.g.
owing to hybridization experiments in vitro and/or in vivo)
capable of forming double-stranded RNA structures. The skilled worker
is aware of the fact that the formation of double-stranded RNA
structures represents a state of equilibrium. Preferably, the
ratio of double-stranded molecules to corresponding dissociated
forms is at least 1 to 10, preferably 1:1, particularly
preferably 5:1, most preferably 10:1.
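
To make the stated equilibrium ratios easier to compare, each ratio of double-stranded to dissociated molecules can be expressed as the share of molecules present as duplex. This is an illustrative calculation only; the helper name `duplex_fraction` and the `RATIOS` table are hypothetical and simply restate the ratios given above.

```python
from fractions import Fraction

# Ratios of double-stranded to dissociated molecules named in the text.
RATIOS = {
    "at least": (1, 10),
    "preferably": (1, 1),
    "particularly preferably": (5, 1),
    "most preferably": (10, 1),
}


def duplex_fraction(duplex_parts, dissociated_parts):
    """Fraction of all molecules that are double-stranded for a given ratio."""
    return Fraction(duplex_parts, duplex_parts + dissociated_parts)


if __name__ == "__main__":
    for label, (duplex, dissociated) in RATIOS.items():
        share = duplex_fraction(duplex, dissociated)
        print(f"{label}: {duplex}:{dissociated} -> {float(share):.1%} double-stranded")
```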

The invention therefore further relates to double-stranded RNA
molecules (dsRNA molecules) which, when introduced into a plant
organism (or into a cell, tissue, organ or propagation material
derived therefrom) cause the reduction of at least one marker
protein. The double-stranded RNA molecule for reducing expression
of a marker protein (MP-dsRNA) here preferably comprises

a) a "sense" RNA strand comprising at least one ribonucleotide
sequence which is essentially identical to at least a part of
the "sense" RNA transcript of a nucleic acid sequence coding
for a marker protein, and

b) an "antisense" RNA strand which is essentially, preferably
fully, complementary to the RNA sense strand under a).
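
The relationship between the two strands under a) and b) can be illustrated for the fully complementary case. The following is a minimal sketch, assuming standard Watson-Crick pairing and 5'-to-3' notation for both strands; the function names and the nine-base example are hypothetical and do not correspond to any SEQ ID NO of the specification.

```python
# Watson-Crick partners for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(dna):
    """Transcribe a DNA coding-strand fragment into the 'sense' RNA sequence."""
    return dna.upper().replace("T", "U")


def antisense_strand(sense_rna):
    """Fully complementary 'antisense' strand, written 5'->3'."""
    return sense_rna.translate(RNA_COMPLEMENT)[::-1]


def is_fully_complementary(sense_rna, antisense_rna):
    """True if the second strand is the exact reverse complement of the first."""
    return antisense_strand(sense_rna) == antisense_rna


if __name__ == "__main__":
    sense = to_rna("ATGGCTAAC")        # hypothetical fragment
    antisense = antisense_strand(sense)
    print(sense, antisense, is_fully_complementary(sense, antisense))
```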

With respect to the dsRNA molecules, marker protein nucleic acid
sequence preferably means a sequence according to SEQ ID NO: 1,
3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35,
37, 39, 41, 43, 45 or 47 or a functional equivalent thereof.

"Essentially identical" means that the dsRNA sequence may also
have insertions, deletions and also individual point mutations in
comparison with the marker protein target sequence and
nevertheless causes an efficient reduction in expression. The homology
(as defined hereinbelow) between the "sense" strand of an
inhibitory dsRNA and at least one part of the "sense" RNA transcript of
a nucleic acid sequence coding for a marker protein (or between
the "antisense" strand and the complementary strand of a nucleic
acid sequence coding for a marker protein) is preferably at least
75%, preferably at least 80%, very particularly preferably at
least 90%, most preferably 100%.
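
As a rough illustration of how the percentage thresholds above might be checked, the sketch below computes a simple position-wise identity over two sequences that have already been aligned (gaps written as "-"). This is a simplification introduced here for illustration and is not the homology definition given later in the specification; the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, minimum=75.0):
    """True if the identity reaches the given minimum (75% is the lowest tier named)."""
    return percent_identity(aligned_a, aligned_b) >= minimum


if __name__ == "__main__":
    # Hypothetical aligned fragments with a single point mutation.
    print(percent_identity("AUGGCUAACG", "AUGGCAAACG"))
    print(meets_threshold("AUGGCUAACG", "AUGGCAAACG", minimum=80.0))
```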

A 100% sequence identity between dsRNA and a marker protein gene
transcript is not absolutely necessary in order to cause an
efficient reduction in marker protein expression. Consequently, the
process is advantageously tolerant toward sequence deviations as
may be present due to genetic mutations, polymorphisms or
evolutionary divergences. Thus it is possible, for example, using the
dsRNA which has been generated starting from the marker protein
sequence of the first organism, to suppress marker protein
expression in a second organism. This is particularly advantageous
when the marker protein used is a plant-intrinsic, endogenous
marker protein (for example a 5-methylthioribose kinase or
alcohol dehydrogenase). For this purpose, the dsRNA preferably
includes sequence regions of marker protein gene transcripts which
correspond to conserved regions. Said conserved regions may be
readily derived from sequence comparisons.



The length of the subsection is at least 10 bases, preferably at 
least 25 bases, particularly preferably at least 50 bases, very 
particularly preferably at least 100 bases, most preferably at 
least 200 bases or at least 300 bases. 

Alternatively, an "essentially identical" dsRNA may also be
defined as a nucleic acid sequence capable of hybridizing with part
of a marker protein gene transcript (e.g. in 400 mM NaCl, 40 mM
PIPES pH 6.4, 1 mM EDTA at 50°C or 70°C for 12 to 16 h).

"Essentially complementary" means that the "antisense" RNA strand
may also have insertions, deletions and also individual point
mutations in comparison with the complement of this "sense" RNA
strand. The homology between the "antisense" RNA strand and the
complement of the "sense" RNA strand is preferably at least 80%,
preferably at least 90%, very particularly preferably at least
95%, most preferably 100%.


"Part of the "sense" RNA transcript" of a nucleic acid sequence
coding for a marker protein means fragments of an RNA or mRNA
transcribed or transcribable from a nucleic acid sequence coding
for a marker protein, preferably from a marker protein gene. In
this context, the fragments have a sequence length of preferably
at least 20 bases, preferably at least 50 bases, particularly
preferably at least 100 bases, very particularly preferably at
least 200 bases, most preferably at least 500 bases. The complete
transcribable RNA or mRNA is also included. Included are also
sequences such as those which may be transcribed under artificial
conditions from regions of a marker protein gene which are
otherwise, under natural conditions, not transcribed, such as promoter
regions, for example.

The dsRNA may consist of one or more strands of
polyribonucleotides. Naturally, in order to achieve the same purpose, it is
also possible to introduce a plurality of individual dsRNA
molecules which comprise in each case one of the above-defined
ribonucleotide sequence sections into the cell or the organism. The
double-stranded dsRNA structure may be formed starting from two
complementary, separate RNA strands or, preferably, starting from
a single, self-complementary RNA strand. In this case, the
"sense" RNA strand and the "antisense" RNA strand are preferably
connected covalently to one another in the form of an inverted
"repeat".



As described in WO 99/53050, for example, the dsRNA may also
comprise a hairpin structure by connecting the "sense" and the
"antisense" strands by a connecting sequence ("linker"; for example
an intron). Preference is given to the self-complementary dsRNA
structures, since they require only the expression of an RNA
sequence and always comprise the complementary RNA strands in an
equimolar ratio. The connecting sequence is preferably an
intron (e.g. an intron of the potato ST-LS1 gene; Vancanneyt GF et
al. (1990) Mol Gen Genet 220(2):245-250).

The nucleic acid sequence coding for a dsRNA may include further
elements such as, for example, transcription termination signals
or polyadenylation signals.
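
The self-complementary arrangement described above (sense fragment, connecting sequence, inverted repeat) can be sketched at the DNA level as follows. This is an illustrative assumption about how such an insert could be assembled in silico, not a construct from the specification: the function names are hypothetical, and the linker shown is an arbitrary placeholder rather than the ST-LS1 intron sequence.

```python
# Watson-Crick partners for DNA bases.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def build_hairpin_insert(target_fragment, linker):
    """Sense fragment + linker + inverted repeat of the sense fragment.

    Transcription of such an insert yields a single RNA whose two arms can
    fold back on each other, with the linker forming the loop.
    """
    sense = target_fragment.upper()
    return sense + linker.upper() + reverse_complement(sense)


if __name__ == "__main__":
    fragment = "ATGGCTAAC"        # hypothetical marker gene fragment
    linker = "GTAAGTTTTCAG"       # placeholder linker, not a real intron
    insert = build_hairpin_insert(fragment, linker)
    print(insert)
    # The last arm is the reverse complement of the first arm.
    print(reverse_complement(insert[-len(fragment):]) == fragment)
```

Because both arms are encoded on one transcript, they are necessarily produced in an equimolar ratio, which is the advantage the text attributes to self-complementary structures.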

Bringing together, if intended, the two strands of the dsRNA in a
cell or plant may be achieved by way of example in the following
way:

a) transformation of the cell or plant with a vector comprising
both expression cassettes,

b) cotransformation of the cell or plant with two vectors, one
of which comprises the expression cassettes containing the
"sense" strand and the other one of which comprises the
expression cassettes containing the "antisense" strand.



The formation of the RNA duplex may be initiated either outside 
or inside the cell. 

The dsRNA may be synthesized either in vivo or in vitro. For this
purpose, a DNA sequence coding for a dsRNA may be inserted into
an expression cassette under the control of at least one genetic
control element (such as a promoter, for example). A
polyadenylation is not necessary and neither need any elements for
initiating a translation be present. Preference is given to the
expression cassette for the MP-dsRNA being present on the
transformation construct or the transformation vector. For this
purpose, the expression cassettes coding for the "antisense"
strand and/or the "sense" strand of an MP-dsRNA or for the
self-complementary strand of the dsRNA are preferably inserted into a
transformation vector and introduced into the plant cell by using
the processes described below. A stable insertion into the genome
may be advantageous for the process of the invention but is not
absolutely necessary. Since a dsRNA causes a long-term effect,
transient expression is also sufficient in many cases. The dsRNA
may also be part of the RNA to be expressed by the nucleic acid
sequence to be inserted by fusing it, for example, to the
3'-untranslated part of said RNA.


The dsRNA may be introduced in an amount which makes possible at
least one copy per cell. Higher amounts (e.g. at least 5, 10,
100, 500 or 1000 copies per cell) may, if appropriate, cause a
more efficient reduction.



b) Introducing an antisense ribonucleic acid sequence of a marker
protein (MP-antisenseRNA)

Processes for reducing a particular protein by means of the
"antisense" technique have been described multiple times, also in
plants (Sheehy et al. (1988) Proc Natl Acad Sci USA 85:8805-8809;
US 4,801,340; Mol JN et al. (1990) FEBS Lett 268(2):427-430). The
antisense nucleic acid molecule hybridizes or binds to the cellular
mRNA and/or genomic DNA coding for the marker protein to be
reduced, thereby suppressing transcription and/or translation of
said marker protein. The hybridization may be produced in a
conventional manner via the formation of a stable duplex or, in the
case of genomic DNA, by binding of the antisense nucleic acid
molecule to the duplex of the genomic DNA via specific interaction
in the major groove of the DNA helix.



An MP-antisenseRNA may be derived using the nucleic acid sequence
coding for this marker protein, for example the nucleic acid
sequence according to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45 or 47, according
to the base pair rules by Watson and Crick. The MP-antisenseRNA
may be complementary to the entire transcribed mRNA of the marker
protein, may be limited to the coding region or may consist only of
an oligonucleotide which is complementary to a part of the coding
or noncoding sequence of the mRNA. Thus, for example, the
oligonucleotide may be complementary to the region comprising the
translation start site for the marker protein. The MP-antisenseRNA
may be, for example, 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50
nucleotides in length, but may also be longer and comprise at least
100, 200, 500, 1000, 2000 or 5000 nucleotides. MP-antisenseRNA are
preferably expressed recombinantly in the target cell in the course
of the process of the invention.
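
The derivation by Watson-Crick base pairing, and the option of an
oligonucleotide covering the translation start site, can be sketched as
follows. This is a minimal illustration under stated assumptions: the
mRNA is a hypothetical placeholder, the first AUG is simply assumed to
be the translation start, and `flank` is a parameter invented for this
sketch.

```python
def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (reverse complement) of an mRNA, written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    rna = mrna.upper().replace("T", "U")
    return "".join(pairs[base] for base in reversed(rna))


def start_site_oligo(mrna: str, flank: int = 10) -> str:
    """Antisense oligonucleotide covering the first AUG plus `flank` nt on each side.

    Assumption: the first AUG in the sequence is the translation start site.
    """
    rna = mrna.upper().replace("T", "U")
    start = rna.find("AUG")
    if start < 0:
        raise ValueError("no AUG start codon found")
    lo = max(0, start - flank)
    hi = min(len(rna), start + 3 + flank)
    return antisense_rna(rna[lo:hi])


# Hypothetical mRNA fragment, not taken from any SEQ ID NO.
oligo = start_site_oligo("GGCACAAUGGCUUCAGGUACC", flank=4)
```

With `flank=4` the sketch yields an 11-nucleotide oligonucleotide, which
lies within the length range named in the text.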


The MP-antisenseRNA may also be part of an RNA to be expressed by
the nucleic acid sequence to be inserted by being fused, for
example, to the 3'-untranslated part of said RNA.



The invention further relates to transgenic expression cassettes
containing a nucleic acid sequence coding for at least part of a
marker protein, with said nucleic acid sequence being functionally
linked in antisense orientation to a promoter functional in
plant organisms. Said expression cassettes may be part of a
transformation construct or transformation vector or else may be
introduced in the course of a cotransformation.



In a further preferred embodiment, expression of a marker protein
may be inhibited by nucleotide sequences which are complementary
to the regulatory region of a marker protein gene (e.g. a marker
protein promoter and/or enhancer) and which form triple-helical
structures with the DNA double helix there, thereby reducing
transcription of the marker protein gene. Corresponding processes
have been described (Helene C (1991) Anticancer Drug Res
6(6):569-84; Helene C et al. (1992) Ann NY Acad Sci 660:27-36;
Maher LJ (1992) Bioassays 14(12):807-815).

In a further embodiment, the MP-antisenseRNA may be an α-anomeric
nucleic acid. Such α-anomeric nucleic acid molecules form with
complementary RNA specific double-stranded hybrids in which, in
contrast to the conventional β-nucleic acids, the two strands are
oriented parallel to one another (Gautier C et al. (1987) Nucleic
Acids Res 15:6625-6641).

c) Introducing an MP-antisenseRNA combined with a ribozyme

Advantageously, the above-described antisense strategy may be
coupled to a ribozyme process. Catalytic RNA molecules or ribozymes
may be adapted to any target RNA and cleave the phosphodiester
backbone in specific positions, thereby functionally deactivating
said target RNA (Tanner NK (1999) FEMS Microbiol Rev
23(3):257-275). In the process, the ribozyme is not modified itself
but is capable of cleaving further target RNA molecules in an
analogous manner, thereby acquiring the properties of an enzyme.
The incorporation of ribozyme sequences into "antisense" RNAs
imparts specifically to these "antisense" RNAs this enzyme-like,
RNA-cleaving property and thus increases their efficiency in
inactivating the target RNA. The preparation and use of
appropriate ribozyme "antisense" RNA molecules have been described
(inter alia in Haseloff and Gerlach (1988) Nature 334:585-591;
Steinecke P et al. (1992) EMBO J 11(4):1525-1530; de Feyter R et
al. (1996) Mol Gen Genet 250(3):329-338).

In this way, it is possible to use ribozymes (e.g. hammerhead
ribozymes; Haseloff and Gerlach (1988) Nature 334:585-591) in order
to catalytically cleave the mRNA of a marker protein to be reduced
and thus prevent translation. The ribozyme technique may increase
the efficiency of an antisense strategy. Processes for expressing
ribozymes in order to reduce particular proteins have been
described (EP 0 291 533, EP 0 321 201, EP 0 360 257). Ribozyme
expression has likewise been described in plant cells (Steinecke P
et al. (1992) EMBO J 11(4):1525-1530; de Feyter R et al. (1996) Mol
Gen Genet 250(3):329-338). Suitable target sequences and ribozymes
may be determined, for example, as described in "Steinecke P,
Ribozymes, Methods in Cell Biology 50, Galbraith et al. eds.,
Academic Press, Inc. (1995), pp. 449-460", by calculating the
secondary structures of ribozyme RNA and target RNA and by the
interaction thereof (Bayley CC et al. (1992) Plant Mol Biol
18(2):353-361; Lloyd AM and Davis RW et al. (1994) Mol Gen Genet
242(6):653-657). It is possible, for example, to construct
derivatives of the Tetrahymena L-19 IVS RNA which have regions
complementary to the mRNA of the marker protein to be suppressed
(see also US 4,987,071 and US 5,116,742). Alternatively, such
ribozymes may also be identified via a selection process from a
library of various ribozymes (Bartel D and Szostak JW (1993)
Science 261:1411-1418).

d) Introducing a sense ribonucleic acid sequence of a marker 
protein (MP-senseRNA) for inducing a cosuppression 

Expression of a marker protein ribonucleic acid sequence (or a
part thereof) in sense orientation may result in a cosuppression
of the corresponding marker protein gene. Expression of sense RNA
with homology to an endogenous marker protein gene may reduce or
switch off expression of the latter, as has been described
similarly for antisense approaches (Jorgensen et al. (1996) Plant
Mol Biol 31(5):957-973; Goring et al. (1991) Proc Natl Acad Sci USA
88:1770-1774; Smith et al. (1990) Mol Gen Genet 224:447-481; Napoli
et al. (1990) Plant Cell 2:279-289; Van der Krol et al. (1990)
Plant Cell 2:291-99). In this context, the introduced construct may
represent completely or only partially the homologous gene to be
reduced. The possibility of translation is not required. The
application of this technique to plants has been described (e.g.
Napoli et al. (1990) Plant Cell 2:279-289; US 5,034,323).



The cosuppression is preferably carried out using a sequence
which is essentially identical to at least part of the nucleic
acid sequence coding for a marker protein, for example the nucleic
acid sequence according to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13,
15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45 or
47.

The MP-senseRNA is preferably chosen in such a way that a
translation of the marker protein or a part thereof cannot occur.
For this purpose, for example, the 5'-untranslated or
3'-untranslated region may be chosen or else the ATG start codon
may be deleted or mutated.


e) Introducing DNA- or protein-binding factors against marker
protein genes, marker protein RNAs or proteins

Marker protein expression may also be reduced using specific
DNA-binding factors, for example factors of the zinc finger
transcription factor type. These factors attach to the genomic
sequence of the endogenous target gene, preferably in the
regulatory regions, and cause a reduction in expression.
Appropriate processes for preparing corresponding factors have been
described (Dreier B et al. (2001) J Biol Chem 276(31):29466-78;
Dreier B et al. (2000) J Mol Biol 303(4):489-502; Beerli et al.
(2000) Proc Natl Acad Sci USA 97(4):1495-1500; Beerli RR et al.
(2000) J Biol Chem 275(42):32617-32627; Segal DJ and Barbas CF 3rd
(2000) Curr Opin Chem Biol 4(1):34-39; Kang JS and Kim JS (2000) J
Biol Chem 275(12):8742-8748; Beerli et al. (1998) Proc Natl Acad
Sci USA 95(25):14628-14633; Kim JS et al. (1997) Proc Natl Acad Sci
USA 94(8):3616-3620; Klug A (1999) J Mol Biol 293(2):215-218; Tsai
SY et al. (1998) Adv Drug Deliv Rev 30(1-3):23-31; Mapp AK et al.
(2000) Proc Natl Acad Sci USA 97(8):3930-3935; Sharrocks AD et al.
(1997) Int J Biochem Cell Biol 29(12):1371-1387; Zhang L et al.
(2000) J Biol Chem 275(43):33850-33860).

These factors may be selected using any segment of a marker
protein gene. This segment is preferably located in the promoter
region. However, for gene suppression, it may also be in the
region of the coding exons or introns.



It is also possible to introduce factors which inhibit the marker
protein itself into a cell. These protein-binding factors may be,
for example, aptamers (Famulok M and Mayer G (1999) Curr Top
Microbiol Immunol 243:123-36) or antibodies or antibody fragments
or single-chain antibodies. Obtaining these factors has been
described (Owen M et al. (1992) Biotechnology (N Y) 10(7):790-794;
Franken E et al. (1997) Curr Opin Biotechnol 8(4):411-416;
Whitelam (1996) Trend Plant Sci 1:286-272).


f) Introducing viral nucleic acid sequences and expression
constructs causing the degradation of marker protein RNA

Marker protein expression may also be reduced effectively by
inducing the specific degradation of marker protein RNA by the
plant with the aid of a viral expression system (amplicon; Angell
SM et al. (1999) Plant J 20(3):357-362). These systems, also
referred to as "VIGS" (virus-induced gene silencing), introduce
nucleic acid sequences with homology to the transcript of a marker
protein to be reduced into the plant by means of viral vectors.
Transcription is then switched off, presumably mediated by plant
defence mechanisms against viruses. Appropriate techniques and
processes have been described (Ratcliff F et al. (2001) Plant J
25(2):237-45; Fagard M and Vaucheret H (2000) Plant Mol Biol
43(2-3):285-93; Anandalakshmi R et al. (1998) Proc Natl Acad Sci
USA 95(22):13079-84; Ruiz MT (1998) Plant Cell 10(6):937-46).

VIGS-mediated reduction is preferably implemented using a sequence
which is essentially identical to at least part of the nucleic
acid sequence coding for a marker protein, for example the
nucleic acid sequence according to SEQ ID NO: 1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43,
45 or 47.



g) Introducing constructs for generating a functional loss or a 
functional reduction of marker protein genes 


The skilled worker knows numerous possible processes for
modifying genomic sequences in a targeted manner. These include, in
particular, processes such as the generation of knockout mutants
by means of targeted homologous recombination, for example by
generating stop codons, shifts in the reading frame etc. (Hohn B
and Puchta H (1999) Proc Natl Acad Sci USA 96:8321-8323) or the
targeted deletion or inversion of sequences by means of, for
example, sequence-specific recombinases or nucleases (see below).


In a preferred embodiment, the marker protein gene is inactivated
by introducing a sequence-specific recombinase. Thus it is
possible, for example, for the marker protein gene to include
recognition sequences for sequence-specific recombinases or to be
flanked by such sequences, and introducing the recombinase then
deletes or inverts particular sequences of the marker protein
gene, thus leading to inactivation of the marker protein gene. A
corresponding procedure is depicted diagrammatically in Fig. 1.
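
The two outcomes named above, deletion or inversion, can be modelled on
plain strings. The sketch below relies on the general property of such
recombinase systems that two recognition sequences in the same
orientation lead to excision of the intervening segment, whereas two
sequences in opposite orientation lead to its inversion; this is an
assumption brought in for the illustration and is not spelled out in the
text. The 7-nucleotide site and the gene fragment are hypothetical
placeholders, not real lox or FRT sequences.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, G<->C)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def recombine(seq: str, site: str) -> str:
    """Resolve the segment between the first two recognition sites in `seq`."""
    first = seq.find(site)
    if first < 0:
        return seq
    inner_start = first + len(site)
    direct = seq.find(site, inner_start)
    inverted = seq.find(reverse_complement(site), inner_start)
    if direct >= 0 and (inverted < 0 or direct < inverted):
        # Same orientation: the segment and one copy of the site are removed.
        return seq[:inner_start] + seq[direct + len(site):]
    if inverted >= 0:
        # Opposite orientation: the segment is flipped in place.
        inner = seq[inner_start:inverted]
        return seq[:inner_start] + reverse_complement(inner) + seq[inverted:]
    return seq


SITE = "GATTACA"  # hypothetical asymmetric recognition sequence
deleted = recombine("CC" + SITE + "ATGAAAC" + SITE + "GG", SITE)
flipped = recombine("CC" + SITE + "ATGAAAC" + reverse_complement(SITE) + "GG", SITE)
```

In both toy outcomes the ATG of the placeholder fragment is no longer
present in its original reading direction, which mirrors the
inactivation described above.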


Appropriate processes for deletion/inversion of sequences by
means of sequence-specific recombinase systems are known to the
skilled worker. Examples which may be mentioned are the Cre/lox
system of bacteriophage P1 (Dale EC and Ow DW (1991) Proc Natl
Acad Sci USA 88:10558-10562; Russell SH et al. (1992) Mol Gen
Genet 234:49-59; Osborne BI et al. (1995) Plant J 7:687-701), the
yeast FLP/FRT system (Kilby NJ et al. (1995) Plant J 8:637-652;
Lyznik LA et al. (1996) Nucl Acids Res 24:3784-3789), the Gin
recombinase of the Mu phage, the E. coli Pin recombinase and the
R/RS system of the pSR1 plasmids (Onouchi H et al. (1995) Mol Gen
Genet 247:653-660; Sugita K et al. (2000) Plant J 22:461-469). In
these systems, the recombinase (for example Cre or FLP) interacts
specifically with its particular recombination sequences (34 bp
lox sequence and 47 bp FRT sequence, respectively). Preference is
given to the bacteriophage P1 Cre/lox and the yeast FLP/FRT
systems. The FLP/FRT and Cre/lox recombinase systems have already
been applied in plant systems (Odell et al. (1990) Mol Gen Genet
223:369-378). Preference is given to introducing the recombinase
by means of recombinant expression starting from an expression
cassette included on a DNA construct.



The activity or amount of the marker protein may also be reduced
by a targeted deletion in the marker protein gene, for example by
sequence-specific induction of DNA double-strand breaks at a
recognition sequence for specific induction of DNA double-strand
breaks in or close to the nucleic acid sequence coding for a
marker protein. In its simplest embodiment (cf. Fig. 2, A and B),
an enzyme which generates at least one double-strand break is to
this end introduced with the transformation construct in such a
way that the resulting illegitimate recombination or deletion
causes a reduction in the activity or amount of marker protein,
for example by inducing a shift in the reading frame or deletion
of essential sequences.



The efficiency of this approach may be increased by the sequence
coding for the marker protein being flanked by sequences (A and
A', respectively) which have a sufficient length and homology to
one another in order to recombine with one another as a
consequence of the induced double-strand break and thus to cause,
due to an intramolecular homologous recombination, a deletion of
the sequence coding for the marker protein. Fig. 3 depicts
diagrammatically a corresponding procedure in an exemplary
embodiment of this variant.



The amount, function and/or activity of the marker protein may
also be reduced by a targeted insertion of nucleic acid sequences
(for example of the nucleic acid sequence to be inserted within
the scope of the process of the invention) into the sequence
coding for a marker protein (e.g. by means of intermolecular
homologous recombination). This embodiment of the process of the
invention is particularly advantageous and preferred, since, in
addition to the general advantages of the process of the
invention, it also makes it possible to insert the nucleic acid
sequence to be inserted into the plant genome in a reproducible,
predictable, location-specific manner. This avoids the positional
effects which otherwise occur in the course of a random,
location-unspecific insertion (and which may manifest themselves,
for example, in the form of different levels of expression of the
transgene or in unintended inactivation of endogenous genes).
Preference is given to using as an "anti-marker protein" compound
in the course of this embodiment a DNA construct which comprises
at least part of the sequence of a marker protein gene or
neighbouring sequences and which can thus specifically recombine
with said sequences in the target cell so that a deletion,
addition or substitution of at least one nucleotide alters the
marker protein gene in such a way that the functionality of said
marker protein gene is reduced or completely removed. The
alteration may also affect the regulatory elements (e.g. the
promoter) of the marker protein gene so that the coding sequence
remains unaltered, but expression (transcription and/or
translation) does not occur or is reduced. In conventional
homologous recombination, the sequence to be inserted is flanked
at its 5' and/or 3' end by further nucleic acid sequences (A' and
B', respectively) which have a sufficient length and homology to
corresponding sequences of the marker protein gene (A and B,
respectively) for making homologous recombination possible. The
length is usually in a range from several hundred bases to several
kilobases (Thomas KR and Capecchi MR (1987) Cell 51:503; Strepp et
al. (1998) Proc Natl Acad Sci USA 95(8):4368-4373). The homologous
recombination is carried out by transforming the plant cell with
the recombination construct by using the process described below
and selecting successfully recombined clones based on the
subsequently inactivated marker protein. Although homologous
recombination is a relatively rare event in plant organisms, cells
can escape the selection pressure by recombination into the marker
protein gene, which allows selection of the recombined cells and
sufficient efficiency of the process. Fig. 4 diagrammatically
depicts a corresponding procedure in an exemplary embodiment of
this variant.



In an advantageous embodiment of the invention, however, inser- 
tion into the marker protein gene is facilitated by means of fur- 
ther functional elements. The term is to be understood as being 
comprehensive and means the use of sequences or of transcripts or 
polypeptides derived therefrom which are capable of increasing 
the efficiency of the specific integration into a marker protein 
gene. Various processes are available to the skilled worker for 
this purpose. However, preference is given to implementing the 
insertion by inducing a sequence-specific double-strand break in 
or close to the marker protein gene. 



In a preferred embodiment of the invention, the marker protein is
inactivated (i.e. the amount, expression, activity or function is
reduced) by integrating a DNA sequence into a marker protein
gene, with the process preferably comprising the following steps:

i) introducing an insertion construct and at least one enzyme
suitable for inducing DNA double-strand breaks at a recognition
sequence for targeted induction of DNA double-strand breaks in
or close to the marker protein gene, and

ii) inducing DNA double-strand breaks at the recognition
sequences for targeted induction of DNA double-strand breaks in
or close to the marker protein gene, and

iii) inserting the insertion construct into the marker protein
gene, with the functionality of the marker protein gene and,
preferably, the functionality of the recognition sequence for
targeted induction of DNA double-strand breaks being inactivated
so that the enzyme suitable for induction of DNA double-strand
breaks can no longer cut said recognition sequence, and

iv) selecting plants or plant cells in which the insertion
construct has been inserted into the marker protein gene.
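
Steps i) to iii) can be illustrated with a deliberately simplified
string model: the gene is cut once inside the recognition sequence, the
insertion construct is placed at the break, and the recognition
sequence is thereby split so that it can no longer be found. The
recognition sequence, the cut position, the gene and the insertion
construct below are hypothetical placeholders; real repair events are
more complex than this blunt insertion.

```python
def insert_at_break(gene: str, site: str, cut_offset: int, insert: str) -> str:
    """Cut `gene` once inside `site` (at `cut_offset`) and place `insert` at the break."""
    pos = gene.find(site)
    if pos < 0:
        raise ValueError("recognition sequence not found")
    cut = pos + cut_offset
    return gene[:cut] + insert + gene[cut:]


# Hypothetical placeholder sequences (assumptions for illustration only).
site = "GGTACCTTAAGG"
marker_gene = "ATGCCC" + site + "GGGTAA"
insertion_construct = "AAAACCCC"

edited = insert_at_break(marker_gene, site, cut_offset=6, insert=insertion_construct)
site_destroyed = site not in edited  # the enzyme would find nothing left to cut
```

The final check corresponds to the preferred feature of step iii): once
the construct has been inserted, the recognition sequence is no longer
intact, so a further cut at that position is not expected.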



The insertion construct preferably comprises the nucleic acid
sequence to be inserted into the genome but may also be used
separately therefrom.

"Enzyme suitable for inducing DNA double-strand breaks at the
recognition sequence for targeted induction of DNA double-strand
breaks" ("DSBI enzyme" for "double-strand-break inducing enzyme"
hereinbelow) means generally all those enzymes which are capable
of generating double-strand breaks in double-stranded DNA in a
sequence-specific manner. Examples which may be mentioned but which
are not limiting are:


1. Restriction endonucleases, preferably type II restriction en- 
donucleases, particularly preferably Homing endonucleases as 
described in detail hereinbelow. 



2. Artificial nucleases as described in detail hereinbelow, such
as, for example, chimeric nucleases, mutated restriction or
Homing endonucleases or RNA protein particles derived from
group II mobile introns.

Both natural and artificially prepared DSBI enzymes are suitable.
Preference is given to all of those DSBI enzymes whose recognition
sequence is known and which can either be obtained in the
form of their proteins (for example by purification) or be
expressed using their nucleic acid sequence.


Preference is given to selecting the DSBI enzyme, with the
knowledge of its specific recognition sequence, in such a way that
it possesses, apart from the target recognition sequence, no
further functional recognition regions in the genome of the target
plant. Very particular preference is therefore given to Homing
endonucleases (overview: Belfort M and Roberts RJ (1997) Nucleic
Acids Res 25:3379-3388; Jasin M (1996) Trends Genet 12:224-228;
Internet: http://rebase.neb.com/rebase/rebase.homing.html; Roberts
RJ and Macelis D (2001) Nucl Acids Res 29:268-269). The latter
fulfill said requirement, owing to their long recognition
sequences. The sequences coding for Homing endonucleases of this
kind may be isolated, for example, from the Chlamydomonas
chloroplast genome (Turmel M et al. (1993) J Mol Biol 232:446-467).
Suitable Homing endonucleases are listed under the abovementioned
internet address. Examples of Homing endonucleases which may be
mentioned are those like F-SceI, F-SceII, F-SuvI, F-TevI, F-TevII,
I-AmaI, I-AniI, I-CeuI, I-CeuAIIP, I-ChuI, I-CmoeI, I-CpaI,
I-CpaII, I-CreI, I-CrepsbIP, I-CrepsbIIP, I-CrepsbIIIP,
I-CrepsbIVP, I-CsmI, I-CvuI, I-CvuAIP, I-DdiII, I-DirI, I-DmoI,
I-HspNIP, I-LlaI, I-MsoI, I-NaaI, I-NanI, I-NclIP, I-NgrIP, I-NitI,
I-NjaI, I-Nsp236IP, I-PakI, I-PboIP, I-PcuIP, I-PcuAI, I-PcuVI,
I-PgrIP, I-PobIP, I-PorI, I-PorIIP, I-PpbIP, I-PpoI, I-SPBetaIP,
I-ScaI, I-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI,
I-SceVII, I-SexIP, I-SneIP, I-SpomCP, I-SpomIP, I-SpomIIP, I-SquIP,
I-Ssp6803I, I-SthPhiJP, I-SthPhiST3P, I-SthPhiS3bP, I-TdeIP,
I-TevI, I-TevII, I-TevIII, I-UarAP, I-UarHGPA1P, I-UarHGPA13P,
I-VinIP, I-ZbiIP, PI-MtuI, PI-MtuHIP, PI-MtuHIIP, PI-PfuI,
PI-PfuII, PI-PkoI, PI-PkoII, PI-PspI, PI-Rma43812IP, PI-SPBetaIP,
PI-SceI, PI-TfuI, PI-TfuII, PI-ThyI, PI-TliI, PI-TliII. Preference
is given here to those Homing endonucleases whose gene sequences
are already known, such as, for example, F-SceI, I-CeuI, I-ChuI,
I-DmoI, I-CpaI, I-CpaII, I-CreI, I-CsmI, F-TevI, F-TevII, I-TevI,
I-TevII, I-AniI, I-CvuI, I-LlaI, I-NanI, I-MsoI, I-NitI, I-NjaI,
I-PakI, I-PorI, I-PpoI, I-ScaI, I-Ssp6803I, PI-PkoI, PI-PkoII,
PI-PspI, PI-TfuI, PI-TliI.
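
The selection criterion stated at the start of this paragraph (no
further recognition regions in the target genome) and the argument from
recognition sequence length can be sketched as follows. The toy genome,
the 7-nucleotide site, the genome size of 125 million base pairs and the
site lengths of 6 and 18 base pairs are assumptions chosen purely for
illustration, and the expectation formula assumes equal, independent
base frequencies.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, G<->C)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def count_sites(genome: str, site: str) -> int:
    """Count positions where `site` occurs on either strand (overlaps included)."""
    genome = genome.upper()
    targets = {site.upper(), reverse_complement(site)}
    width = len(site)
    return sum(
        1
        for i in range(len(genome) - width + 1)
        if genome[i:i + width] in targets
    )


def expected_random_hits(genome_size: int, site_length: int) -> float:
    """Expected chance occurrences on both strands of a random genome."""
    return 2 * genome_size / 4 ** site_length


hits = count_sites("AAGATTACAAATGTAATCAA", "GATTACA")  # one hit per strand
short_site = expected_random_hits(125_000_000, 6)      # many chance hits expected
long_site = expected_random_hits(125_000_000, 18)      # well below one expected
```

Under these assumptions a short site is expected to recur by chance many
times in a genome of this size, whereas a long site is expected to recur
far less than once, which is the reasoning behind preferring enzymes
with long recognition sequences.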

Very particular preference is given to

I-CeuI (Cote MJ and Turmel M (1995) Curr Genet 27:177-183;
Gauthier A et al. (1991) Curr Genet 19:43-47; Marshall (1991)
Gene 104:241-245; GenBank Acc. No.: Z17234, nucleotides 5102
to 5758),

I-ChuI (Cote V et al. (1993) Gene 129:69-76; GenBank Acc. No.:
L06107, nucleotides 419 to 1075),

I-CmoeI (Drouin M et al. (2000) Nucl Acids Res 28:4566-4572),

I-CpaI from Chlamydomonas pallidostigmatica (GenBank Acc.
No.: L36830, nucleotides 357 to 815; Turmel M et al. (1995)
Nucleic Acids Res 23:2519-2525; Turmel M et al. (1995)
Mol Biol Evol 12:533-545),

I-CpaII (Turmel M et al. (1995) Mol Biol Evol 12:533-545;
GenBank Acc. No.: L39865, nucleotides 719 to 1423),

I-CreI (Wang J et al. (1997) Nucleic Acids Res 25:3767-3776;
Durrenberger F and Rochaix JD (1991) EMBO J 10:3495-3501;
GenBank Acc. No.: X01977, nucleotides 571 to 1062),

I-CsmI (Ma DP et al. (1992) Plant Mol Biol 18:1001-1004),

I-NanI (Elde M et al. (1999) Eur J Biochem 259:281-288;
GenBank Acc. No.: X78280, nucleotides 418 to 1155),

I-NitI (GenBank Acc. No.: X78277, nucleotides 426 to 1163),

I-NjaI (GenBank Acc. No.: X78279, nucleotides 416 to 1153),

I-PpoI (Muscarella DE and Vogt VM (1989) Cell 56:443-454;
Lin J and Vogt VM (1998) Mol Cell Biol 18:5809-5817; GenBank
Acc. No.: M38131, nucleotides 86 to 577),

I-PspI (GenBank Acc. No.: U00707, nucleotides 1839 to 3449),

I-ScaI (Monteilhet C et al. (2000) Nucleic Acids Res
28:1245-1251; GenBank Acc. No.: X95974, nucleotides 55 to 465),

I-SceI (WO 96/14408; US 5,962,327, therein Seq ID NO: 1),

Endo SceI (Kawasaki et al. (1991) J Biol Chem 266:5342-5347,
identical to F-SceI; GenBank Acc. No.: M63839, nucleotides
159 to 1589),

I-SceII (Sarguiel B et al. (1990) Nucleic Acids Res
18:5659-5665),

I-SceIII (Sarguiel B et al. (1991) Mol Gen Genet
255:340-341),

I-Ssp6803I (GenBank Acc. No.: D64003, nucleotides 35372 to
35824),

I-TevI (Chu et al. (1990) Proc Natl Acad Sci USA
87:3574-3578; Bell-Pedersen et al. (1990) Nucleic Acids
Res 18:3763-3770; GenBank Acc. No.: AF158101, nucleotides
144431 to 143694),

I-TevII (Bell-Pedersen et al. (1990) Nucleic Acids Res
18:3763-3770; GenBank Acc. No.: AF158101, nucleotides 45612
to 44836),

I-TevIII (Eddy et al. (1991) Genes Dev 5:1032-1041).


Very particular preference is given to commercially available
Homing endonucleases such as I-CeuI, I-SceI, I-PpoI, PI-PspI or
PI-SceI. Most preference is given to I-SceI and I-PpoI. While the
gene coding for I-PpoI may be utilized in its natural form, the
gene coding for I-SceI possesses an editing site. Since, in
contrast to yeast mitochondria, the appropriate editing is not
carried out in higher plants, an artificial sequence encoding the
I-SceI protein must be used for heterologous expression of this
enzyme (US 5,866,361).


The enzymes may be purified from their source organisms in the 
manner familiar to the skilled worker and/or the nucleic acid se- 
quence encoding said enzymes may be cloned. The sequences of var- 
ious enzymes have been deposited with GenBank (see above). 


Artificial DSBI enzymes which may be mentioned by way of example
are chimeric nucleases which are composed of an unspecific
nuclease domain and a sequence-specific DNA-binding domain (e.g.
consisting of zinc fingers) (Smith J et al. (2000) Nucl Acids Res
28(17):3361-3369; Bibikova M et al. (2001) Mol Cell Biol
21:289-297). Thus, for example, the catalytic domain of the
restriction endonuclease FokI has been fused to zinc finger-binding
domains, thereby defining the specificity of the endonuclease
(Chandrasegaran S & Smith J (1999) Biol Chem 380:841-848; Kim YG
& Chandrasegaran S (1994) Proc Natl Acad Sci USA 91:883-887; Kim
YG et al. (1996) Proc Natl Acad Sci USA 93:1156-1160). The
described technique has also been used previously for imparting a
predefined specificity to the catalytic domain of the yeast Ho
endonuclease by fusing said domain to the zinc finger domain of
transcription factors (Nahon E & Raveh D (1998) Nucl Acids Res
26:1233-1239). It is possible, using suitable mutation and
selection processes, to adapt existing Homing endonucleases to any
desired recognition sequence.

As mentioned, zinc finger proteins are particularly suitable as
DNA-binding domains within chimeric nucleases. These DNA-binding
zinc finger domains may be adapted to any DNA sequence.
Appropriate processes for preparing corresponding zinc finger
domains have been described and are known to the skilled worker
(Beerli RR et al. (2000) Proc Natl Acad Sci 97(4):1495-1500;
Beerli RR et al. (2000) J Biol Chem 275(42):32617-32627; Segal DJ
and Barbas CF 3rd (2000) Curr Opin Chem Biol 4(1):34-39; Kang JS
and Kim JS (2000) J Biol Chem 275(12):8742-8748; Beerli RR et al.
(1998) Proc Natl Acad Sci USA 95(25):14628-14633; Kim JS et al.
(1997) Proc Natl Acad Sci USA 94(8):3616-3620; Klug A (1999) J Mol
Biol 293(2):215-218; Tsai SY et al. (1998) Adv Drug Deliv Rev
30(1-3):23-31; Mapp AK et al. (2000) Proc Natl Acad Sci USA
97(8):3930-3935; Sharrocks AD et al. (1997) Int J Biochem Cell
Biol 29(12):1371-1387; Zhang L et al. (2000) J Biol Chem
275(43):33850-33860). Processes for preparing and selecting zinc
finger DNA-binding domains with high sequence specificity have
been described (WO 96/06166, WO 98/53059, WO 98/53057). Fusing a
DNA-binding domain obtained in this way to the catalytic domain
of an endonuclease (such as, for example, the FokI or Ho
endonuclease) enables chimeric nucleases to be prepared which have
any desired specificity and which may be used advantageously as
DSBI enzymes within the scope of the present invention.

Artificial DSBI enzymes with altered sequence specificity may
also be generated by mutating already known restriction
endonucleases or Homing endonucleases, using methods familiar to
the skilled worker. Besides the mutagenesis of Homing
endonucleases, the mutagenesis of maturases is of particular
interest for the purpose of obtaining an altered substrate
specificity. Maturases frequently share many features with Homing
endonucleases and, if appropriate, can be converted into nucleases
by carrying out few mutations. This has been shown, for example,
for the maturase in the bakers' yeast bi2 intron. Only two
mutations in the maturase-encoding open reading frame (ORF)
sufficed to impart to this enzyme a Homing-endonuclease activity
(Szczepanek & Lazowska (1996) EMBO J 15:3758-3767).

Further artificial nucleases may be generated with the aid of
mobile group II introns and the proteins encoded by them, or parts
of these proteins. Mobile group II introns, together with the
proteins encoded by them, form RNA-protein particles which are
capable of recognizing and cutting DNA in a sequence-specific
manner. In this context, the sequence specificity can be adapted
to the requirements by mutating particular regions of the intron
(see below) (WO 97/10362).

Preference is given to expressing the DSBI enzyme as a fusion
protein with a nuclear localization sequence (NLS). This NLS
sequence facilitates transport into the nucleus and increases
the efficiency of the recombination system. Various NLS
sequences are known to the skilled worker and described, inter
alia, in Hicks GR and Raikhel NV (1995) Annu. Rev. Cell Biol.
11:155-188. For example, the NLS sequence of the SV40 large
antigen is preferred for plant organisms. Very particular preference
is given to the following NLS sequences:

NLS1: N-Pro-Lys-Thr-Lys-Arg-Lys-Val-C

NLS2: N-Pro-Lys-Lys-Lys-Arg-Lys-Val-C

Owing to the small size of many DSBI enzymes (such as, for
example, the homing endonucleases), an NLS sequence is not absolutely
necessary, however. These enzymes are able to pass through the
nuclear pores even without this assistance.
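The two preferred NLS sequences can be compared more easily in one-letter notation. The following is a minimal sketch, not part of the described process; the helper names `to_one_letter` and `differing_positions` are hypothetical and only the residues occurring in NLS1 and NLS2 are mapped.

```python
# Three-letter to one-letter amino acid codes (only the residues used here).
THREE_TO_ONE = {"Pro": "P", "Lys": "K", "Thr": "T", "Arg": "R", "Val": "V"}

NLS1 = "N-Pro-Lys-Thr-Lys-Arg-Lys-Val-C"
NLS2 = "N-Pro-Lys-Lys-Lys-Arg-Lys-Val-C"


def to_one_letter(peptide: str) -> str:
    """Convert 'N-Xaa-...-C' notation into a one-letter string."""
    tokens = peptide.split("-")
    # Drop the leading 'N' and trailing 'C' terminus markers.
    residues = tokens[1:-1]
    return "".join(THREE_TO_ONE[r] for r in residues)


def differing_positions(a: str, b: str) -> list[int]:
    """Return 1-based positions at which two equal-length peptides differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    one, two = to_one_letter(NLS1), to_one_letter(NLS2)
    print(one, two, differing_positions(one, two))
```

Under these assumptions the two sequences differ only at the third residue (Thr in NLS1, Lys in NLS2).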

"Recognition sequence for targeted induction of DNA double-strand 
breaks" means in general those sequences which allow recognition
and cleavage by the DSBI enzyme under the conditions in the
eukaryotic cell or organism used in this case. In this context,
mention is made, by way of example but not by limitation, in table 1
below of the recognition sequences for the particular DSBI
enzymes listed.

Table 1: Recognition sequences and source organisms of DSBI
enzymes ("^" indicates the cleavage site of the DSBI
enzyme within a recognition sequence)

DSBI enzyme             Source organism                       Recognition sequence

CRE                     Bacteriophage P1                      5'-AACTCTCATCGCTTCGGATAACTTCCTGTTATCCGAAACAT
                                                              ATCACTCACTTTGGTGATTTCACCGTAACTGTCTATGATTAATG-3'
FLP                     Saccharomyces cerevisiae              5'-GAAGTTCCTATTCCGAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC-3'
R                       pSR1 plasmids                         5'-CGAGATCATATCACTGTGGACGTTGATGAAAGAATACGTTA
                                                              TTCTTTCATCAAATCGT
P-element transposase   Drosophila                            5'-CTAGATGAAATAACATAAGGTGG
I-AniI                  Aspergillus nidulans                  5'-TTGAGGAGGTT^TCTCTGTAAATAANNNNNNNNNNNNNNN
                                                              3'-AACTCCTCCAAAGAGACATTTATTNNNNNNNNNNNNNNN^
I-DdiI                  Dictyostelium discoideum AX3          5'-TTTTTTGGTCATCCAGAAGTATAT
                                                              3'-AAAAAACCAG^TAGGTCTTCATATA
I-CvuI                  Chlorella vulgaris                    5'-CTGGGTTCAAAACGTCGTGA^GACAGTTTGG
                                                              3'-GACCCAAGTTTTGCAG^CACTCTGTCAAACC
I-CsmI                  Chlamydomonas smithii                 5'-GTACTAGCATGGGGTCAAATGTCTTTCTGG
I-CmoeI                 Chlamydomonas moewusii                5'-TCGTAGCAGCT^CACGGTT
                                                              3'-AGCATCG^TCGAGTGCCAA
I-CreI                  Chlamydomonas reinhardtii             5'-CTGGGTTCAAAACGTCGTGA^GACAGTTTGG
                                                              3'-GACCCAAGTTTTGCAG^CACTCTGTCAAACC
I-ChuI                  Chlamydomonas humicola                5'-GAAGGTTTGGCACCTCG^ATGTCGGCTCATC
                                                              3'-CTTCCAAACCGTG^GAGCTACAGCCGAGTAG
I-CpaI                  Chlamydomonas pallidostigmatica       5'-CGATCCTAAGGTAGCGAA^ATTCA
                                                              3'-GCTAGGATTCCATC^GCTTTAAGT
I-CpaII                 Chlamydomonas pallidostigmatica       5'-CCCGGCTAACTCTGTGCCAG
                                                              3'-GGGCCGAT^TGAGACACGGTC
I-CeuI                  Chlamydomonas eugametos               5'-CGTAACTATAACGGTCCTAA^GGTAGCGAA
                                                              3'-GCATTGATATTGCCACGATTCCATCGCTT
I-DmoI                  Desulfurococcus mobilis               5'-ATGCCTTGCCGGGTAA^GTTCCGGCGCGCAT
                                                              3'-TACGGAACGGCC^CATTCAAGGCCGCGCGTA
I-SceI                  S. cerevisiae                         5'-AGTTACGCTAGGGATAA^CAGGGTAATATAG
                                                              3'-TCAATGCGATCCC^TATTGTCCCATTATATC
                                                              5'-TAGGGATAA^CAGGGTAAT
                                                              3'-ATCCC^TATTGTCCCATTA ("Core" sequence)
I-SceII                 S. cerevisiae                         5'-TTTTGATTCTTTGGTCACCC^TGAAGTATA
                                                              3'-AAAACTAAGAAACCACTGGGACTTCATAT
I-SceIII                S. cerevisiae                         5'-ATTGGAGGTTTTGGTAAC^TATTTATTACC
                                                              3'-TAACCTCCAAAACC^ATTGATAAATAATGG
I-SceIV                 S. cerevisiae                         5'-TCTTTTCTCTTGATTA^GCCCTAATCTACG
                                                              3'-AGAAAAGAGAACTAATCGGGATTAGATGC
I-SceV                  S. cerevisiae                         5'-AATAATTTTCT^TCTTAGTAATGCC
                                                              3'-TTATTAAAAGAAGAATCATTA^CGG
I-SceVI                 S. cerevisiae                         5'-GTTATTTAATCTTTTAGTAGTTGG
                                                              3'-CAATAAATTACAAAATCATCA^ACC
I-SceVII                S. cerevisiae                         5'-TGTCACATTGAGGTGCACTAGTTATTAC
PI-SceI                 S. cerevisiae                         5'-ATCTATGTCGGGTGC^GGAGAAAGAGGTAAT
                                                              3'-TAGATACAGCC^CACGCCTCTTTCTCCATTA
F-SceI                  S. cerevisiae                         5'-GATGCTGTAGGC^ATAGGCTTGGTT
                                                              3'-CTACGACA^TCCGTATCCGAACCAA
F-SceII                 S. cerevisiae                         5'-CTTTCCGCAACA^GTAAAATT
                                                              3'-GAAAGGCG^TTGTCATTTTAA
I-HmuI                  Bacillus subtilis bacteriophage SPO1  5'-AGTAATGAGCCTAACGCTCAGCAA
                                                              3'-TCATTACTCGGATTGC^GAGTCGTT
I-HmuII                 Bacillus subtilis bacteriophage SP82  5'-AGTAATGAGCCTAACGCTCAACAANNNNNNNNNN
                                                              NNNNNNNNNNNNNNNNNNNNNN
I-LlaI                  Lactococcus lactis                    5'-CACATCCATAAC^CATATCATTTTT
                                                              3'-GTGTAGGTATTGGTATAGTAA^AAA
I-MsoI                  Monomastix species                    5'-CTGGGTTCAAAACGTCGTGA^GACAGTTTGG
                                                              3'-GACCCAAGTTTTGCACCACTCTGTCAAACC
I-NanI                  Naegleria andersoni                   5'-AAGTCTGGTGCCA^GCACCCGC
                                                              3'-TTCAGACC^ACGGTCGTGGGCG
I-NitI                  Naegleria italica                     5'-AAGTCTGGTGCCA^GCACCCGC
                                                              3'-TTCAGACC^ACGGTCGTGGGCG
I-NjaI                  Naegleria jamiesoni                   5'-AAGTCTGGTGCCA^GCACCCGC
                                                              3'-TTCAGACC^ACGGTCGTGGGCG
I-PakI                  Pseudendoclonium akinetum             5'-CTGGGTTCAAAACGTCGTGA^GACAGTTTGG
                                                              3'-GACCCAAGTTTTGCAG^CACTCTGTCAAACC
I-PorI                  Pyrobaculum organotrophum             5'-GCGAGCCCGTAAGGGT^GTGTACGGG
                                                              3'-CGCTCGGGCATT^CCCACACATGCCC
I-PpoI                  Physarum polycephalum                 5'-TAACTATGACTCTCTTAA^GGTAGCCAAAT
                                                              3'-ATTGATACTGAGAG^AATTCCATCGGTTTA
I-ScaI                  Saccharomyces capensis                5'-TGTCACATTGAGGTGCACT^AGTTATTAC
                                                              3'-ACAGTGTAACTCCAC^GTGATCAATAATG
I-Ssp6803I              Synechocystis species                 5'-GTCGGGCT^CATAACCCGAA
                                                              3'-CAGCCCGAGTA^TTGGGCTT
PI-PfuI                 Pyrococcus furiosus Vc1               5'-GAAGATGGGAGGAGGG^ACCGGACTCAACTT
                                                              3'-CTTCTACCCTCC^TCCCTGGCCTGAGTTGAA
PI-PfuII                Pyrococcus furiosus Vc1               5'-ACGAATCCATGTGGAGA^AGAGCCTCTATA
                                                              3'-TGCTTAGGTACAC^CTCTTCTCGGAGATAT
PI-PkoI                 Pyrococcus kodakaraensis KOD1         5'-GATTTTAGAT^CCCTGTACC
                                                              3'-CTAAAA^TCTAGGGACATGG
PI-PkoII                Pyrococcus kodakaraensis KOD1         5'-CAGTACTACCGTTAC
                                                              3'-GTCATC^ATGCCAATG
PI-PspI                 Pyrococcus sp.                        5'-AAAATCCTGGCAAACAGCTATTAT^GGGTAT
                                                              3'-TTTTAGGACCGTTTGTCGAT^AATACCCATA
PI-TfuI                 Thermococcus fumicolans ST557         5'-TAGATTTTAGGT^CGCTATATCCTTCC
                                                              3'-ATCTAAAA^TCCAGCGATATAGGAAGG
PI-TfuII                Thermococcus fumicolans ST557         5'-TAYGCNGAYACN^GACGGYTTYT
                                                              3'-ATRCGNCT^RTGNCTGCCRAARA
PI-ThyI                 Thermococcus hydrothermalis           5'-TAYGCNGAYACN^GACGGYTTYT
                                                              3'-ATRCGNCT^RTGNCTGCCRAARA
PI-TliI                 Thermococcus litoralis                5'-TAYGCNGAYACNGACGG^YTTYT
                                                              3'-ATRCGNCTRTGNC^TGCCRAARA
PI-TliII                Thermococcus litoralis                5'-AAATTGCTTGCAAACAGCTATTACGGCTAT
I-TevI                  Bacteriophage T4                      5'-AGTGGTATCAAC^GCTCAGTAGATG
                                                              3'-TCACCATAGT^TGCGAGTCATCTAC
I-TevII                 Bacteriophage T4                      5'-GCTTATGAGTATGAAGTGAACACGT^TATTC
                                                              3'-CGAATACTCATACTTCACTTGTG^CAATAAG
F-TevI                  Bacteriophage T4                      5'-GAAACACAAGA^AATGTTTAGTAAANNNNNNNNNNNNNN
                                                              3'-CTTTGTGTTCTTTACAAATCATTTNNNNNNNNNNNNNN^
F-TevII                 Bacteriophage T4                      5'-TTTAATCCTCGCTTCAGATATGGCAACTG
                                                              3'-AAATTAGGAGCGA^AGTCTATACCGTTGAC


Relatively small deviations (degenerations) of the recognition
sequence which nevertheless make possible recognition and cleavage
by the particular DSBI enzyme are also included here. Such
deviations, also in connection with different basic conditions
such as, for example, calcium or magnesium concentration, have
been described (Argast GM et al. (1998) J Mol Biol 280:345-353).
Core sequences of these recognition sequences are also included.
It is known that the inner portions of the recognition sequences
also suffice for an induced double-strand break and that the
outer portions are not necessarily relevant but may contribute to
determining the cleavage efficiency. Thus, for example, an 18 bp
core sequence can be defined for I-SceI.
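Whether a given construct already contains a recognition sequence can be checked by a simple string search on both strands. The following is a minimal sketch under stated assumptions: it uses the 18 bp I-SceI core sequence from table 1 with the cleavage marker removed, looks only for exact matches (degenerate sites are not detected), and the function names are hypothetical.

```python
# 18 bp I-SceI core sequence (top strand), cleavage marker "^" removed.
I_SCEI_CORE = "TAGGGATAACAGGGTAAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str = I_SCEI_CORE) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact occurrence of `site`.

    Strand "+" means the site reads 5'->3' on the given sequence,
    strand "-" means it lies on the opposite strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    demo = "GG" + I_SCEI_CORE + "CC" + reverse_complement(I_SCEI_CORE) + "AA"
    print(find_sites(demo))
```

An exact-match search is the simplest choice; tolerating the small deviations mentioned above would require a mismatch-aware comparison instead.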


Said DSBI recognition sequences may be localized in various posi- 
tions in or close to a marker protein gene and, for example when 
the marker protein used is a transgene, may already be incorpo- 
rated when constructing the marker protein expression cassette. 
Various possible localizations are illustrated by way of example 
in Figs. 2-A, 2-B, 3 and 5 and in the descriptions thereof. 



In a further advantageous embodiment, the insertion sequence
comprises at least one homology sequence A which has a sufficient
length and a sufficient homology to a sequence A' in the marker
protein gene in order to ensure homologous recombination between
A and A'. The insertion sequence is preferably flanked by two
sequences A and B which have a sufficient length and a sufficient
homology to a sequence A' and, respectively, B' in the marker
protein gene in order to ensure homologous recombination between
A and A' and, respectively, B and B'.

"Sufficient length" means, with respect to the homology sequences 
A, A' and B, B', preferably sequences with a length of at least
100 base pairs, preferably at least 250 base pairs, particularly
preferably at least 500 base pairs, very particularly preferably
at least 1000 base pairs, most preferably of at least 2500 base
pairs.






"Sufficient homology" means, with respect to the homology se- 
quences, preferably sequences whose homology to one another is at 
least 70%, preferably 80%, preferentially at least 90%, particu- 
larly preferably at least 95%, very particularly preferably at 
least 99%, most preferably 100%, over a length of at least 20 
base pairs, preferably at least 50 base pairs, particularly pre- 
ferably at least 100 base pairs, very particularly preferably at 
least 250 base pairs, most preferably at least 500 base pairs. 

Homology between two nucleic acids means the identity of the
nucleic acid sequence over in each case the entire sequence length,
which identity is calculated by way of comparison with the aid of
the GAP program algorithm (Wisconsin Package Version 10.0,
University of Wisconsin, Genetics Computer Group (GCG), Madison,
USA), setting the following parameters:

Gap Weight: 12            Length Weight: 4

Average Match: 2.912      Average Mismatch: -2.003
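The way such an identity value arises from a global alignment with a gap weight and a length weight can be illustrated as follows. This is a minimal sketch and not the GAP program itself: it assumes a plain affine-gap global alignment in which the listed average match and mismatch values are used directly as substitution scores, a gap of k positions costs gap weight plus k times length weight, and identity is reported as identical positions divided by alignment columns. The function name `global_identity` is hypothetical.

```python
NEG_INF = float("-inf")


def global_identity(a: str, b: str, match: float = 2.912,
                    mismatch: float = -2.003, gap_weight: float = 12.0,
                    length_weight: float = 4.0) -> float:
    """Fraction of identical columns in an optimal affine-gap global alignment."""
    a, b = a.upper(), b.upper()
    n, m = len(a), len(b)
    empty = (NEG_INF, 0, 0)
    # Every cell stores (score, identical positions, alignment columns).
    # M: last column is a base pair; X: gap in b; Y: gap in a.
    M = [[empty] * (m + 1) for _ in range(n + 1)]
    X = [[empty] * (m + 1) for _ in range(n + 1)]
    Y = [[empty] * (m + 1) for _ in range(n + 1)]
    M[0][0] = (0.0, 0, 0)
    for i in range(1, n + 1):
        X[i][0] = (-(gap_weight + length_weight * i), 0, i)
    for j in range(1, m + 1):
        Y[0][j] = (-(gap_weight + length_weight * j), 0, j)

    def step(cell, delta, same=0):
        # Extend an alignment by one column.
        return (cell[0] + delta, cell[1] + same, cell[2] + 1)

    open_cost = -(gap_weight + length_weight)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = 1 if a[i - 1] == b[j - 1] else 0
            score = match if same else mismatch
            diag = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            M[i][j] = step(diag, score, same)
            X[i][j] = max(step(M[i - 1][j], open_cost),
                          step(Y[i - 1][j], open_cost),
                          step(X[i - 1][j], -length_weight))
            Y[i][j] = max(step(M[i][j - 1], open_cost),
                          step(X[i][j - 1], open_cost),
                          step(Y[i][j - 1], -length_weight))
    _, identical, columns = max(M[n][m], X[n][m], Y[n][m])
    return identical / columns if columns else 0.0


if __name__ == "__main__":
    print(global_identity("ACGTACGT", "ACGTTCGT"))
```

Carrying the identity counts along with the score avoids a separate traceback; a production tool would normally report the alignment itself as well.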

In a further preferred embodiment, the recombination efficiency
is increased by a combination with processes which promote
homologous recombination. Such systems have been described and
comprise, by way of example, expression of proteins such as RecA or
treatment with PARP inhibitors. It has been demonstrated that
intrachromosomal homologous recombination in tobacco plants can
be increased by using PARP inhibitors (Puchta H et al. (1995)
Plant J 7:203-210). The use of these inhibitors can further
increase the rate of homologous recombination in the recombinant
constructs, after inducing the sequence-specific DNA double-strand
break, and thus the efficiency of the deletion of the
transgene sequences. Various PARP inhibitors may be used here.
Preference is given to including inhibitors such as
3-aminobenzamide, 8-hydroxy-2-methylquinazolin-4-one (NU1025),
1,11b-dihydro-[2H]benzopyrano[4,3,2-de]isoquinolin-3-one (GPI 6150),
5-aminoisoquinolinone,
3,4-dihydro-5-[4-(1-piperidinyl)butoxy]-1(2H)-isoquinolinone or
the substances described in
WO 00/26192, WO 00/29384, WO 00/32579, WO 00/64878, WO 00/68206,
WO 00/67734, WO 01/23366 and WO 01/23390.


Further suitable methods are the introduction of nonsense
mutations into endogenous marker protein genes, for example by means
of introducing RNA/DNA oligonucleotides into the plant (Zhu
et al. (2000) Nat Biotechnol 18(5):555-558). Point mutations may
also be generated by means of DNA-RNA hybrids, an approach also
known as "chimeraplasty" (Cole-Strauss et al. (1999) Nucl Acids
Res 27(5):1323-1330; Kmiec (1999) Gene therapy. American Scientist
87(3):240-247).
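Positions at which a single point mutation would create a nonsense (stop) codon can be listed mechanically. The following is a minimal sketch, assuming the standard genetic code and a coding sequence given in frame on the sense strand; the function name `nonsense_candidates` is hypothetical and the example sequence is purely illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA notation


def nonsense_candidates(cds: str) -> list[tuple[int, str, str]]:
    """List single-base substitutions that turn a sense codon into a stop codon.

    Returns tuples of (0-based position in cds, original base, new base).
    """
    cds = cds.upper()
    hits = []
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    for start in range(0, usable, 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue
        for offset in range(3):
            for base in "ACGT":
                if base == codon[offset]:
                    continue
                mutant = codon[:offset] + base + codon[offset + 1:]
                if mutant in STOP_CODONS:
                    hits.append((start + offset, codon[offset], base))
    return hits


if __name__ == "__main__":
    # ATG TGG AAA: the TGG and AAA codons can each be converted to a stop.
    print(nonsense_candidates("ATGTGGAAA"))
```

Candidates near the 5' end of the coding region are usually the more interesting ones, since an early stop codon truncates more of the protein.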






The methods of dsRNAi, cosuppression by means of sense RNA and
VIGS (virus induced gene silencing) are also referred to as
post-transcriptional gene silencing (PTGS). PTGS processes are
particularly advantageous because the demands on the homology between
the marker protein gene to be reduced and the transgenically
expressed sense or dsRNA nucleic acid sequence are lower than, for
example, in the case of a traditional antisense approach. Thus it
is possible, using the marker protein nucleic acid sequences from
one species, to effectively reduce also expression of homologous
marker proteins in other species, without it being absolutely
necessary to isolate and to elucidate the structure of the
marker protein homologues occurring there. Considerably less
labor is therefore required.

"Introduction" comprises within the scope of the invention any 
processes which are suitable for introducing an "anti-marker
protein" compound, directly or indirectly, into a plant or a cell,
compartment, tissue, organ or seeds of said plant or generating
said compound there. The introduction may result in a transient
presence of an "anti-marker protein" compound (for example a
dsRNA or a recombinase) or else in a permanent (stable) presence.



According to the different nature of the approaches described
above, the "anti-marker protein" compound may exert its function
directly (for example by way of insertion into an endogenous
marker protein gene). However, said function may also be exerted
indirectly after transcription into an RNA (for example in
antisense approaches) or after transcription and translation into a
protein (for example in the case of recombinases or DSBI
enzymes). The invention comprises both directly and indirectly
acting "anti-marker protein" compounds.



Introducing comprises, for example, processes such as
transfection, transduction or transformation.



"Anti-marker protein" compounds thus comprises, for example, also 
expression cassettes capable of implementing expression (i.e.
transcription and, if appropriate, translation) of, for example,
an MP-dsRNA, an MP-antisenseRNA, a sequence-specific recombinase
or a DSBI enzyme in a plant cell.



"Expression cassette" means within the scope of the present in- 
vention generally those constructions in which a nucleic acid se- 
quence to be expressed is functionally linked to at least one ge- 
netic control sequence, preferably a promoter sequence. 
Expression cassettes preferably consist of double-stranded DNA 
and may have a linear or circular structure. 

A functional linkage means, for example, the sequential
arrangement of a promoter with a nucleic acid sequence to be transcribed
(for example coding for an MP-dsRNA or a DSBI enzyme) and, if
appropriate, further regulatory elements such as, for example, a
terminator and/or polyadenylation signals in such a way that each
of the regulatory elements can fulfill its function during
transcription of the nucleic acid sequence, depending on the
arrangement of the nucleic acid sequences. In this context, function can
mean, for example, the control of expression, i.e. transcription
and/or translation, of the nucleic acid sequence (e.g. coding for
an MP-dsRNA or a DSBI enzyme). In this context, control
comprises, for example, initiating, increasing, controlling or
suppressing the expression, i.e. transcription and, if appropriate,
translation. This does not necessarily require a direct linkage
in the chemical sense. Genetic control sequences such as, for
example, enhancer sequences, may exert their function on the target
sequence also from positions further away or even from different
DNA molecules. Preference is given to arrangements in which the
nucleic acid sequence to be transcribed is positioned downstream
of the sequence acting as promoter so that both sequences are
covalently connected to one another. The distance between the
promoter sequence and the nucleic acid sequence to be expressed
transgenically is here preferably less than 200 base pairs,
particularly preferably less than 100 base pairs, very particularly
preferably less than 50 base pairs.
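The arrangement described above (promoter, a short distance, the sequence to be expressed, optionally a terminator) can be modelled as a simple data structure. The following is a minimal sketch for illustration only; the class `ExpressionCassette`, its field names and the tier labels returned by `spacing_tier` are hypothetical, and the sequences in the usage example are placeholders rather than real promoter or coding sequences.

```python
from dataclasses import dataclass


@dataclass
class ExpressionCassette:
    """Linear cassette: promoter, spacer, sequence to be expressed, terminator."""
    promoter: str
    spacer: str            # DNA between promoter and sequence to be expressed
    coding: str            # e.g. coding for an MP-dsRNA or a DSBI enzyme
    terminator: str = ""   # optional 3' control sequence

    def sequence(self) -> str:
        """Return the assembled cassette in 5'->3' order."""
        return self.promoter + self.spacer + self.coding + self.terminator

    def spacing_tier(self) -> str:
        """Classify the promoter-to-sequence distance by the stated thresholds."""
        distance = len(self.spacer)
        if distance < 50:
            return "very particularly preferred"
        if distance < 100:
            return "particularly preferred"
        if distance < 200:
            return "preferred"
        return "outside the preferred range"


if __name__ == "__main__":
    cassette = ExpressionCassette("TTGACA", "A" * 30, "ATGAAATAA", "GGG")
    print(cassette.spacing_tier(), len(cassette.sequence()))
```

Keeping the spacer as its own field makes the distance requirement explicit instead of leaving it implicit in the concatenated sequence.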






The skilled worker knows various ways of obtaining any of the
expression cassettes of the invention. An expression cassette of
the invention is prepared, for example, preferably by direct
fusion of a nucleic acid sequence acting as promoter to a
nucleotide sequence to be expressed (e.g. coding for an MP-dsRNA or a
DSBI enzyme). A functional linkage may be produced by means of
common recombination and cloning techniques, as are described,
for example, in Maniatis T, Fritsch EF and Sambrook J (1989)
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY and in Silhavy TJ et al. (1984)
Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold
Spring Harbor, NY and in Ausubel FM et al. (1987) Current
Protocols in Molecular Biology, Greene Publishing Assoc. and Wiley
Interscience.


The expression cassettes of the invention preferably comprise a
promoter 5' upstream of the particular nucleic acid sequence to
be expressed transgenically and a terminator sequence as an
additional genetic control sequence 3' downstream and also, if
appropriate, further customary regulatory elements, in each case
functionally linked to the nucleic acid sequence to be expressed
transgenically.



The term "genetic control sequences" is to be understood broadly
and means all those sequences which have an influence on the
making or function of the expression cassette of the invention. For
example, genetic control sequences ensure transcription and, if
appropriate, translation in prokaryotic or eukaryotic organisms.
Genetic control sequences are described, for example, in
"Goeddel; Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, CA (1990)" or "Gruber and Crosby, in:
Methods in Plant Molecular Biology and Biotechnology, CRC Press,
Boca Raton, Florida, eds.: Glick and Thompson, Chapter 7, 89-108"
and in the references quoted there.



Genetic control sequences comprise, in particular in plants, 
functional promoters. Preferred promoters suitable for the ex- 
pression cassettes are in principle any promoters capable of con- 
trolling expression of genes, in particular foreign genes, in 
plants.

Plant-specific promoters or promoters functional in plants or in 
a plant cell means in principle any promoter capable of control- 
ling expression of genes, in particular foreign genes, in at 
least one plant or one part, cell, tissue, culture of a plant. In 
this context, expression may be, for example, constitutive, in- 
ducible or development-dependent. Preference is given to: 



a) Constitutive promoters 






"Constitutive" promoters means those promoters which ensure 
expression in numerous, preferably all, tissues over a
relatively large period of plant development, preferably at all
points in time of plant development (Benfey et al. (1989) EMBO
J 8:2195-2202). Preference is given in particular to using a
plant promoter or a promoter which is derived from a plant
virus. Particular preference is given to the promoter of the
35S transcript of the CaMV cauliflower mosaic virus (Franck
et al. (1980) Cell 21:285-294; Odell et al. (1985) Nature
313:810-812; Shewmaker et al. (1985) Virology 140:281-288;
Gardner et al. (1986) Plant Mol Biol 6:221-228) or the 19S
CaMV promoter (US 5,352,605; WO 84/02913; Benfey et al.
(1989) EMBO J 8:2195-2202) and also to the promoter of the
Arabidopsis thaliana nitrilase-1 gene (GenBank Acc. No.:
Y07648, nucleotides 2456 (alternatively 2861) to 4308 or
alternatively 4340 or 4344, e.g. bp 2456 to 4340).

Other suitable constitutive promoters are the rubisco small
subunit (SSU) promoter (US 4,962,028), the leguminB promoter
(GenBank Acc. No.: X03677), the promoter of the Agrobacterium
nopaline synthase, the TR dual promoter, the Agrobacterium
OCS (octopine synthase) promoter, the ubiquitin promoter
(Holtorf S et al. (1995) Plant Mol Biol 29:637-649), the
ubiquitin 1 promoter (Christensen et al. (1992) Plant Mol Biol
18:675-689; Bruce et al. (1989) Proc Natl Acad Sci USA
86:9692-9696), the Smas promoter, the cinnamyl alcohol
dehydrogenase promoter (US 5,683,439), the promoters of the
vacuolar ATPase subunits or the promoter of a proline-rich
protein from wheat (WO 91/13991), and further promoters of genes
whose constitutive expression in plants is known to the
skilled worker.

b) Tissue-specific promoters 

Preference is given to promoters with specificities for the 
anthers, ovaries, flowers, leaves, stems, roots or seeds. 


Seed-specific promoters comprise, for example, the
promoter of phaseolin (US 5,504,200; Bustos MM et al. (1989)
Plant Cell 1(9):839-53), of the 2S albumin (Josefsson LG
et al. (1987) J Biol Chem 262:12196-12201), of legumin
(Shirsat A et al. (1989) Mol Gen Genet 215(2):326-331),
of USP (unknown seed protein; Baumlein H et al. (1991)
Mol Gen Genet 225(3):459-67), of napin (US 5,608,152;
Stalberg K et al. (1996) L Planta 199:515-519), of the
sucrose-binding protein (WO 00/26388), of legumin B4
(LeB4; Baumlein H et al. (1991) Mol Gen Genet 225:
121-128; Baeumlein et al. (1992) Plant Journal
2(2):233-9; Fiedler U et al. (1995) Biotechnology (NY)
13(10):1090f), of oleosin (WO 98/45461) or of Bce4 (WO
91/13980). Further suitable seed-specific promoters are
those of the genes coding for the high molecular weight
glutenin (HMWG), gliadin, branching enzyme, ADP glucose
pyrophosphatase (AGPase) or starch synthase. Preference
is further given to promoters which allow seed-specific
expression in monocotyledones such as corn, barley,
wheat, rye, rice, etc. Promoters which may be employed
advantageously are the promoter of the lpt2 or lpt1 gene
(WO 95/15389, WO 95/23230) and the promoters described in
WO 99/16890 (hordein, glutelin, oryzin, prolamin,
gliadin, zein, kasirin or secalin promoters). Further
seed-specific promoters are described in WO 89/03887.

Tuber-, storage-root- or root-specific promoters
comprise, for example, the class I patatin promoter (B33) or
the promoter of the potato cathepsin D inhibitor.




Leaf-specific promoters comprise, for example, the
promoter of the potato cytosolic FBPase (WO 97/05900), the
SSU promoter (small subunit) of rubisco
(ribulose-1,5-bisphosphate carboxylase) or the potato ST-LS1
promoter (Stockhaus et al. (1989) EMBO J 8:2445-2451).

Flower-specific promoters comprise, for example, the phy- 
toene synthase promoter (WO 92/16635) or the promoter of 
the P-rr gene (WO 98/22593). 

Anther-specific promoters comprise, for example, the 5126
promoter (US 5,689,049, US 5,689,051), the glob-1
promoter and the γ-zein promoter.



c) Chemically inducible promoters

Chemically inducible promoters allow expression control as a
function of an exogenous stimulus (review article: Gatz et
al. (1997) Ann Rev Plant Physiol Plant Mol Biol 48:89-108).
Examples which may be mentioned are: the PRP1 promoter (Ward
et al. (1993) Plant Mol Biol 22:361-366), a salicylic
acid-inducible promoter (WO 95/19443), a
benzenesulfonamide-inducible promoter (EP-A 0 388 186), a
tetracycline-inducible promoter (Gatz et al. (1992) Plant J
2:397-404), an abscisic acid-inducible promoter (EP 0 335 528)
and an ethanol- or cyclohexanone-inducible promoter
(WO 93/21334). Also suitable is the promoter of the
glutathione S-transferase isoform II gene (GST-II-27), which
may be activated by exogenously applied safeners such as, for
example, N,N-diallyl-2,2-dichloroacetamide (WO 93/01294) and
which is functional in numerous tissues of both
monocotyledones and dicotyledones.



Particular preference is given to constitutive or inducible
promoters.

Preference is further given to plastid-specific promoters for
targeted expression in the plastids. Suitable promoters are
described, for example, in WO 98/55595 or WO 97/06250. Promoters
which may be mentioned here are the rpoB promoter element, the
atoB promoter element, the clpP promoter element (see also WO
99/46394) and the 16S rDNA promoter element. Viral promoters are
also suitable (WO 95/16783).

Targeted expression in plastids may also be achieved by using,
for example, a bacterial or bacteriophage promoter, introducing
the resulting expression cassette into the plastid DNA and then
driving expression by means of a fusion protein of a bacterial
or bacteriophage polymerase and a plastid transit peptide. US
5,925,806 describes an appropriate process.



Genetic control sequences further comprise also the
5'-untranslated regions, introns or noncoding 3' region of genes, such as,
for example, the actin-1 intron, or the Adh1-S introns 1, 2 and 6
(general overview: The Maize Handbook, Chapter 116, Freeling and
Walbot, Eds., Springer, New York (1994)). These sequences have
been shown to be able to play a significant function in the
regulation of gene expression. Thus it has been demonstrated that
5'-untranslated sequences may increase transient expression of
heterologous genes. They may further promote tissue specificity
(Rouster J et al. (1998) Plant J. 15:435-440). As an example of
translation enhancers, mention may be made of the 5' leader
sequence of the tobacco mosaic virus (Gallie et al. (1987) Nucl
Acids Res 15:8693-8711).

Polyadenylation signals suitable as control sequences are in
particular polyadenylation signals of plant genes and also
Agrobacterium tumefaciens T-DNA polyadenylation signals. Examples of
particularly suitable terminator sequences are the OCS (octopine
synthase) terminator and the NOS (nopaline synthase) terminator
(Depicker A et al. (1982) J Mol Appl Genet 1:561-573) and also the
terminators of soybean actin, RUBISCO or alpha-amylase from wheat
(Baulcombe DC et al. (1987) Mol Gen Genet 209:33-40).



Advantageously, the expression cassette may contain one or more
"enhancer sequences" functionally linked to the promoter, which
make increased transgenic expression of the nucleic acid sequence
possible.



Genetic control sequences further means sequences coding for
fusion proteins consisting of a signal peptide sequence. The
expression of a target gene is possible in any desired cell
compartment, such as, for example, the endomembrane system, the
vacuole and the chloroplasts. Desired glycosylation reactions, in
particular foldings, and the like are possible by utilizing the
secretory pathway. Secretion of the target protein to the cell
surface or secretion into the culture medium, for example when
using suspension-cultured cells or protoplasts, is also possible.
The target sequences required for this may both be taken into
account in individual vector variations and be introduced into the
vector together with the target gene to be cloned by using a
suitable cloning strategy. Target sequences which may be used are
both endogenous, if present, and heterologous sequences.
Additional heterologous sequences which are preferred for functional
linkage but not limited thereto are further targeting sequences
for ensuring subcellular localization in the apoplast, in the
vacuole, in plastids, in the mitochondrion, in the endoplasmic
reticulum (ER), in the nucleus, in elaioplasts or other
compartments; and also translation enhancers such as the 5' leader
sequence from tobacco mosaic virus (Gallie et al. (1987) Nucl Acids
Res 15:8693-8711) and the like. The process of transporting
proteins which are per se not located in the plastids specifically
into said plastids has been described (Klosgen RB and Weil JH
(1991) Mol Gen Genet 225(2):297-304; Van Breusegem F et al.
(1998) Plant Mol Biol 38(3):491-496).

Control sequences are furthermore understood to be those which
make possible a homologous recombination or insertion into the
genome of a host organism or allow the removal from the genome.
Methods such as the cre/lox technique allow the expression
cassette to be removed tissue-specifically, possibly inducibly from
the genome of the host organism (Sauer B. Methods. 1998;
14(4):381-92). Here, particular flanking sequences are attached
to the target gene (lox sequences), which make subsequent removal
by means of the cre recombinase possible.
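The outcome of such a removal can be illustrated at the sequence level. The following is a minimal sketch under stated assumptions: it uses the commonly cited 34 bp wild-type loxP sequence (not given in this text), considers only two sites in direct orientation on the same strand, and models the net result of recombination (loss of the intervening DNA and of one site copy) rather than the enzymatic mechanism. The function name `excise_floxed` is hypothetical.

```python
# Commonly cited 34 bp wild-type loxP site: 13 bp repeat, 8 bp spacer,
# 13 bp inverted repeat. Assumption: not taken from the text above.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def excise_floxed(genome: str, site: str = LOXP) -> str:
    """Remove the DNA between the first two directly repeated `site` copies.

    One copy of the site remains in the product. If fewer than two sites
    are present, the sequence is returned unchanged.
    """
    first = genome.find(site)
    if first == -1:
        return genome
    second = genome.find(site, first + len(site))
    if second == -1:
        return genome
    # Keep everything up to and including the first site, then continue
    # directly after the second site.
    return genome[:first + len(site)] + genome[second + len(site):]


if __name__ == "__main__":
    floxed = "AAA" + LOXP + "GGGG" + LOXP + "TTT"
    print(excise_floxed(floxed) == "AAA" + LOXP + "TTT")
```

Sites in inverted orientation would lead to inversion rather than excision and are deliberately not handled in this sketch.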

Preferably, the expression cassette, consisting of a linkage of
the promoter to the nucleic acid sequence to be transcribed, may
have been integrated into a vector and may be transferred into
the plant cell or organism, for example, by transformation,
according to any of the processes described below.






"Transgenic" means preferably, for example with respect to a
transgenic expression cassette, a transgenic expression vector, a
transgenic organism or to processes for transgenic expression of
nucleic acids, all constructions brought about by genetic
engineering methods or processes using said constructions, in which
either

a) the nucleic acid sequence to be expressed, or

b) the promoter functionally linked to the nucleic acid sequence
to be expressed according to a), or

c) (a) and (b)

are not located in their natural, genetic environment (i.e. at
their natural chromosomal locus) or have been modified by genetic
engineering methods, the modification possibly being, for
example, a substitution, addition, deletion, inversion or insertion
of one or more nucleotide residues. Natural genetic environment
means the natural chromosomal locus in the source organism or the
presence in a genomic library.

"Transgenic" means, with respect to expression ("transgenic
expression"), preferably all expressions achieved using a
transgenic expression cassette, transgenic expression vector or
transgenic organism, according to the definitions indicated above.






The DNA constructs employed within the scope of the process of
the invention and the vectors derived therefrom may contain
further functional elements. The term functional element is to be
understood broadly and means all of those elements which
influence the preparation, propagation or function of the DNA
constructs or of vectors or organisms derived therefrom. Examples
which may be mentioned without being limited thereto are:

1. Selection markers

Selection markers comprise, for example, those nucleic acid or
protein sequences whose expression gives to a cell, tissue or
organism an advantage (positive selection marker) or disadvantage
(negative selection marker) over cells which do not express said
nucleic acid or protein. Positive selection markers act, for
example, by detoxifying a substance acting on the cell in an
inhibitory manner (e.g. resistance to antibiotics/herbicides) or by
forming a substance which enables the plant to regenerate better
or grow more under the chosen conditions (for example nutritive
markers, hormone-producing markers such as ipt; see below).
Another type of positive selection marker comprises mutated
proteins or RNAs which are not sensitive to a selective agent (e.g.
16S rRNA mutants which are insensitive to spectinomycin).
Negative selection markers act, for example, by catalyzing the
formation of a toxic substance in the transformed cells (e.g. the codA
gene).

1.1 Positive selection markers: 

In order to further increase the efficiency, the DNA constructs
may comprise additional positive selection markers. In a
preferred embodiment, the process of the invention may thus be
carried out in the form of a dual selection in which a sequence
coding for a resistance to at least one toxin, antibiotic or
herbicide is introduced together with the nucleic acid sequence
to be inserted and selection is carried out additionally by using
the toxin, antibiotic or herbicide.

Appropriate proteins and sequences of positive selection markers
and also selection processes are familiar to the skilled worker.
The selection marker imparts to the successfully transformed
cells a resistance to a biocide (e.g. a herbicide such as
phosphinothricin, glyphosate or bromoxynil), a metabolism
inhibitor such as 2-deoxyglucose 6-phosphate (WO 98/45456) or an
antibiotic such as, for example, tetracycline, ampicillin,
kanamycin, G 418, neomycin, bleomycin or hygromycin. Selection
markers which may be mentioned by way of example are:


- phosphinothricin acetyltransferases (PAT) which acetylate the
  free amino group of the glutamine synthetase inhibitor
  phosphinothricin (PPT) and thus detoxify PPT (de Block et al.
  (1987) EMBO J 6:2513-2518) (also referred to as Bialaphos®
  resistance gene (bar)). Corresponding sequences are known to
  the skilled worker (from Streptomyces hygroscopicus GenBank
  Acc. No.: X17220 and X05822, from Streptomyces
  viridochromogenes GenBank Acc. No.: M22827 and X65195;
  US 5,489,520). Furthermore, synthetic genes have been described
  for expression in plastids. A synthetic PAT gene is described
  in Becker et al. (1994) Plant J 5:299-307. The genes impart a
  resistance to the herbicide bialaphos or glufosinate and are
  frequently used markers in transgenic plants (Vickers JE et al.
  (1996) Plant Mol Biol Reporter 14:363-368; Thompson CJ et al.
  (1987) EMBO J 6:2519-2523).



- 5-enolpyruvylshikimate 3-phosphate synthases (EPSPS) which
  impart a resistance to glyphosate (N-(phosphonomethyl)glycine).
  The molecular target of the nonselective herbicide glyphosate
  is 5-enolpyruvyl-3-phosphoshikimate synthase (EPSPS). This
  enzyme has a key function in the biosynthesis of aromatic amino
  acids in microbes and plants but not in mammals (Steinrücken HC
  et al. (1980) Biochem Biophys Res Commun 94:1207-1212; Levin JG
  and Sprinson DB (1964) J Biol Chem 239:1142-1150; Cole DJ
  (1985) Mode of action of glyphosate: a literature analysis,
  p. 48-74. In: Grossbard E and Atkinson D (eds.). The herbicide
  glyphosate. Butterworths, Boston). Preference is given to using
  glyphosate-tolerant EPSPS variants as selection markers
  (Padgette SR et al. (1996) New weed control opportunities:
  development of soybeans with a Roundup Ready™ gene. In:
  Herbicide Resistant Crops (Duke SO, ed.), pp. 53-84. CRC Press,
  Boca Raton, FL; Saroha MK and Malik VS (1998) J Plant
  Biochemistry and Biotechnology 7:65-72). The EPSPS gene of
  Agrobacterium sp. strain CP4 has a natural tolerance for
  glyphosate, which can be transferred to appropriate transgenic
  plants. The CP4 EPSPS gene was cloned from Agrobacterium sp.
  strain CP4 (Padgette SR et al. (1995) Crop Science
  35(5):1451-1461). Sequences of EPSPS enzymes which are
  glyphosate-tolerant have been described (inter alia in
  US 5,510,471; US 5,776,760; US 5,864,425; US 5,633,435;
  US 5,627,061; US 5,463,175; EP 0 218 571). Further sequences
  are described under GenBank Acc. No.: X63374 or M10947.

- Glyphosate-degrading enzymes (gox gene; glyphosate
  oxidoreductase). GOX (for example Achromobacter sp. glyphosate
  oxidoreductase) catalyzes the cleavage of a C-N bond in
  glyphosate, which is thus converted to aminomethylphosphonic
  acid (AMPA) and glyoxylate. GOX can thereby impart a resistance
  to glyphosate (Padgette SR et al. (1996) J Nutr 126(3):702-16;
  Shah D et al. (1986) Science 233:478-481).

- The deh gene encodes a dehalogenase which inactivates Dalapon®
  (GenBank Acc. No.: AX022822, AX022820 and WO 99/27116).

- The bxn genes encode bromoxynil-degrading nitrilase enzymes
  (GenBank Acc. No.: E01313 and J03196).

- Neomycin phosphotransferases impart a resistance to antibiotics
  (aminoglycosides) such as neomycin, G418, hygromycin,
  paromomycin or kanamycin by reducing the inhibiting action of
  said antibiotics by means of a phosphorylation reaction.
  Particular preference is given to the nptII gene. Sequences can
  be obtained from GenBank (AF080390; AF080389). Moreover, the
  gene is already part of numerous expression vectors and can be
  isolated therefrom using processes familiar to the skilled
  worker (AF234316; AF234315; AF234314). The NPTII gene encodes
  an aminoglycoside 3'-O-phosphotransferase from E. coli, Tn5
  (GenBank Acc. No.: U00004 position 1401-2300; Beck et al.
  (1982) Gene 19:327-336).

- The DOGR1 gene was isolated from the yeast Saccharomyces
  cerevisiae (EP-A 0 807 836) and encodes a 2-deoxyglucose
  6-phosphate phosphatase which imparts a resistance to 2-DOG
  (Randez-Gil et al. (1995) Yeast 11:1233-1240; Sanz et al.
  (1994) Yeast 10:1195-1202; GenBank Acc. No.: NC001140;
  position 194799-194056).

- Acetolactate synthases which impart a resistance to
  imidazolinone/sulfonylurea herbicides (GenBank Acc. No.:
  X51514; Sathasivan K et al. (1990) Nucleic Acids Res
  18(8):2188; AB049823; AF094326; X07645; X07644; A19547; A19546;
  A19545; I05376; I05373; AL133315)



- Hygromycin phosphotransferases (e.g. GenBank Acc. No.: X74325)
  which impart a resistance to the antibiotic hygromycin. The
  gene is part of numerous expression vectors and may be isolated
  therefrom using processes familiar to the skilled worker (such
  as, for example, polymerase chain reaction) (GenBank Acc. No.:
  AF294981; AF234301; AF234300; AF234299; AF234298; AF354046;
  AF354045).


- Genes of resistance to

  a) Chloramphenicol (chloramphenicol acetyltransferase),

  b) Tetracycline (inter alia GenBank Acc. No.: X65876; X51366).
     Moreover, the gene is already part of numerous expression
     vectors and may be isolated therefrom using processes
     familiar to the skilled worker (such as, for example,
     polymerase chain reaction),

  c) Streptomycin (inter alia GenBank Acc. No.: AJ278607),

  d) Zeocin; the corresponding resistance gene is part of
     numerous cloning vectors (e.g. GenBank Acc. No.: L36849)
     and may be isolated therefrom using processes familiar to
     the skilled worker (such as, for example, polymerase chain
     reaction),

  e) Ampicillin (β-lactamase gene; Datta N, Richmond MH (1966)
     Biochem J 98(1):204-9; Heffron F et al. (1975) J Bacteriol
     122:250-256; Bolivar F et al. (1977) Gene 2:95-114). The
     sequence is part of numerous cloning vectors and may be
     isolated therefrom using processes familiar to the skilled
     worker (such as, for example, polymerase chain reaction).



Genes such as isopentenyl transferase from Agrobacterium
tumefaciens (strain: PO22) (GenBank Acc. No.: AB025109) may also
be used as selection markers. The ipt gene encodes a key enzyme
of cytokinin biosynthesis. Its overexpression facilitates the
regeneration of plants (e.g. selection on cytokinin-free medium).
The process for utilizing the ipt gene has been described
(Ebinuma H et al. (1997) Proc Natl Acad Sci USA 94:2117-2121;
Ebinuma H et al. (2000) Selection of marker-free transgenic
plants using the oncogenes (ipt, rol A, B, C) of Agrobacterium
as selectable markers. In: Molecular Biology of Woody Plants.
Kluwer Academic Publishers).

Various other positive selection markers which impart to the
transformed plants a growth advantage over untransformed plants
and also processes for their use are described, inter alia, in
EP-A 0 601 092. Examples which may be mentioned are
β-glucuronidase (in connection with cytokinin glucuronide, for
example), mannose 6-phosphate isomerase (in connection with
mannose), UDP-galactose 4-epimerase (in connection with
galactose, for example).



For a selection marker functional in plastids, particular
preference is given to those which impart a resistance to
spectinomycin, streptomycin, kanamycin, lincomycin, gentamycin,
hygromycin, methotrexate, bleomycin, phleomycin, blasticidin,
sulfonamide, phosphinothricin, chlorsulfuron, bromoxynil,
glyphosate, 2,4-D, atrazine, 4-methyltryptophan, nitrate,
S-aminoethyl-L-cysteine, lysine/threonine, aminoethyl-cysteine or
betainealdehyde. Particular preference is given to the genes
aadA, nptII, BADH, FLARE-S (a fusion of aadA and GFP, described
in Khan MS & Maliga P (1999) Nature Biotech 17:910-915).
Especially suitable is the aadA gene (Svab Z and Maliga P (1993)
Proc Natl Acad Sci USA 90:913-917). Modified 16S rDNA and also
betainealdehyde dehydrogenase (BADH) from spinach have also been
described (Daniell H et al. (2001) Trends Plant Science
6:237-239; Daniell H et al. (2001) Curr Genet 39:109-116;
WO 01/64023; WO 01/64024; WO 01/64850). Lethal agents such as,
for example, glyphosate may also be utilized in connection with
correspondingly detoxifying or resistance enzymes (WO 01/81605).

The concentrations of the antibiotics, herbicides, biocides or
toxins which are used in each case for selection must be adapted
to the particular test conditions or organisms. Examples which
may be mentioned for plants are kanamycin (Km) 50 mg/L,
hygromycin B 40 mg/L, phosphinothricin (Ppt) 6 mg/L,
spectinomycin (Spec) 500 mg/L.
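
As an illustration only, the example concentrations above can be
turned into a small lookup for preparing selection media. The
following is a minimal sketch: the dictionary holds the four
values quoted in the text, while the function name, the stock
concentrations and the medium volumes are hypothetical and would
have to be adapted to the particular test conditions.

```python
# Example selection concentrations quoted in the text, in mg/L.
# All other names and numbers in this sketch are illustrative assumptions.
EXAMPLE_SELECTION_MG_PER_L = {
    "kanamycin": 50.0,
    "hygromycin B": 40.0,
    "phosphinothricin": 6.0,
    "spectinomycin": 500.0,
}


def stock_volume_ml(agent: str, medium_volume_l: float, stock_mg_per_ml: float) -> float:
    """Return the volume of stock solution (mL) needed to reach the
    example concentration of `agent` in `medium_volume_l` liters of medium."""
    if medium_volume_l <= 0 or stock_mg_per_ml <= 0:
        raise ValueError("volume and stock concentration must be positive")
    target_mg_per_l = EXAMPLE_SELECTION_MG_PER_L[agent]  # KeyError for unknown agents
    required_mg = target_mg_per_l * medium_volume_l
    return required_mg / stock_mg_per_ml
```

For instance, with an assumed kanamycin stock of 50 mg/mL, one
liter of medium would require 1 mL of stock under these example
values.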



2. Reporter genes 


Reporter genes code for readily quantifiable proteins and thus
ensure, via intrinsic color or enzyme activity, an evaluation of
the transformation efficiency and of the location or time of
expression. In this context, very particular preference is given
to genes coding for reporter proteins (see also Schenborn E,
Groskreutz D (1999) Mol Biotechnol 13(1):29-44) such as

- green fluorescence protein (GFP) (Chiu WL et al. (1996) Curr
  Biol 6:325-330; Leffel SM et al. (1997) Biotechniques
  23(5):912-8; Sheen et al. (1995) Plant J 8(5):777-784;
  Haseloff et al. (1997) Proc Natl Acad Sci USA 94(6):2122-2127;
  Reichel et al. (1996) Proc Natl Acad Sci USA 93(12):5888-5893;
  Tian et al. (1997) Plant Cell Rep 16:267-271; WO 97/41228)

- chloramphenicol transferase

- luciferase (Millar et al. (1992) Plant Mol Biol Rep 10:324-414;
  Ow et al. (1986) Science 234:856-859); allows bioluminescence
  detection

- β-galactosidase (encodes an enzyme for which various
  chromogenic substrates are available)

- β-glucuronidase (GUS) (Jefferson et al. (1987) EMBO J
  6:3901-3907) or the uidA gene (encode enzymes for which various
  chromogenic substrates are available)

- R-locus gene product which regulates production of anthocyanin
  pigments (red color) in plant tissue and thus makes possible a
  direct analysis of the promoter activity without addition of
  auxiliary substances or chromogenic substrates (Dellaporta et
  al. (1988) In: Chromosome Structure and Function: Impact of New
  Concepts, 18th Stadler Genetics Symposium, 11:263-282)

- tyrosinase (Katz et al. (1983) J Gen Microbiol 129:2703-2714),
  an enzyme which oxidizes tyrosine to give DOPA and
  dopaquinone, which consequently form the readily detectable
  melanin.

- aequorin (Prasher et al. (1985) Biochem Biophys Res Commun
  126(3):1259-1268), may be used in calcium-sensitive
  bioluminescence detection.

3. Origins of replication which ensure propagation of the
   expression cassettes or vectors of the invention, for example
   in E. coli. Examples which may be mentioned are ORI (origin
   of DNA replication), the pBR322 ori or the P15A ori (Sambrook
   et al.: Molecular Cloning. A Laboratory Manual, 2nd ed. Cold
   Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
   1989).



PP 53790 



CA 02493364 2005-01-21 




4. Elements, for example border sequences, which enable
   agrobacteria-mediated transfer into plant cells for transfer
   and integration into the plant genome, such as, for example,
   the right or left border of T-DNA or the vir region.

5. Multiple cloning regions (MCS) allow and facilitate the in- 
sertion of one or more nucleic acid sequences. 

Nucleic acid sequences (e.g. expression cassettes) may be
introduced into a plant organism or cells, tissues, organs, parts
or seeds thereof by advantageously using vectors which contain
said sequences. Vectors may be, by way of example, plasmids,
cosmids, phages, viruses or else agrobacteria. The sequences may
be inserted into the vector (preferably a plasmid vector) via
suitable restriction cleavage sites. The resulting vector may
first be introduced into E. coli and amplified. Correctly
transformed E. coli are selected and grown, and the recombinant
vector is obtained using methods familiar to the skilled worker.
Restriction analysis and sequencing may serve to check the
cloning step.

Preference is given to those vectors which make possible a stable 
integration into the host genome. 

The preparation of a transformed organism (or a transformed cell
or tissue) requires that the corresponding DNA (e.g. the
transformation vector) or RNA is introduced into the
corresponding host cell. For this process, which is referred to
as transformation (or transduction or transfection), a
multiplicity of methods and vectors are available (Keown et al.
(1990) Methods in Enzymology 185:527-537; Plant Molecular Biology
and Biotechnology (CRC Press, Boca Raton, Florida), Chapter 6/7,
pp. 71-119 (1993); White FF (1993) Vectors for Gene Transfer in
Higher Plants; in: Transgenic Plants, Vol. 1, Engineering and
Utilization, editors: Kung and R. Wu, Academic Press, 15-38;
Jenes B et al. (1993) Techniques for Gene Transfer, in:
Transgenic Plants, Vol. 1, Engineering and Utilization, editors:
Kung and R. Wu, Academic Press, pp. 128-143; Potrykus (1991) Annu
Rev Plant Physiol Plant Molec Biol 42:205-225; Halford NG,
Shewry PR (2000) Br Med Bull 56(1):62-73).


For example, the DNA or RNA may be introduced directly by
microinjection (WO 92/09696, WO 94/00583, EP-A 0 331 083,
EP-A 0 175 966) or by bombardment with DNA- or RNA-coated
microparticles (biolistic processes using the gene gun, "particle
bombardment"; US 5,100,792; EP-A 0 444 882; EP-A 0 434 616;
Fromm ME et al. (1990) Bio/Technology 8(9):833-9; Gordon-Kamm et
al. (1990) Plant Cell 2:603). The cell may also be permeabilized
chemically, for example with polyethylene glycol, so as to enable
the DNA to reach the cell by means of diffusion. The DNA may also
be introduced by means of protoplast fusion with other
DNA-containing units such as minicells, cells, lysosomes or
liposomes (Freeman et al. (1984) Plant Cell Physiol. 29:1353ff;
US 4,536,475). Electroporation is another suitable method for
introducing DNA, in which the cells are permeabilized reversibly
by an electric impulse (EP-A 290 395, WO 87/06614). Further
processes comprise the calcium phosphate-mediated transformation,
DEAE-dextran-mediated transformation, the incubation of dry
embryos in DNA-containing solution or other methods of direct
introduction of DNA (DE 4 005 152, WO 90/12096, US 4,684,611).
Appropriate processes have been described (e.g. in Bilang et al.
(1991) Gene 100:247-250; Scheid et al. (1991) Mol Gen Genet
228:104-112; Guerche et al. (1987) Plant Science 52:111-116;
Neuhaus et al. (1987) Theor Appl Genet 75:30-36; Klein et al.
(1987) Nature 327:70-73; Howell et al. (1980) Science 208:1265;
Horsch et al. (1985) Science 227:1229-1231; DeBlock et al. (1989)
Plant Physiology 91:694-701; Methods for Plant Molecular Biology
(Weissbach and Weissbach, eds.) Academic Press Inc. (1988); and
Methods in Plant Molecular Biology (Schuler and Zielinski, eds.)
Academic Press Inc. (1989)). Physical methods of introducing DNA
into plant cells have been reviewed by Oard (1991) Biotech Adv
9:1-11.



In the case of these "direct" transformation methods, no particu- 
lar requirements are made on the plasmid used. It is possible to 
use simple plasmids such as those of the pUC series, pBR322, 
M13mp series, pACYC184 etc. 


Besides these "direct" transformation techniques, transformation
may also be carried out by bacterial infection by means of
Agrobacterium (e.g. EP 0 116 718), viral infection by means of
viral vectors (EP 0 067 553; US 4,407,956; WO 95/34668;
WO 93/03161) or by means of pollen (EP 0 270 356; WO 85/01856;
US 4,684,611).



Transformation is preferably carried out by means of agrobacteria
which contain disarmed Ti-plasmid vectors, using the latter's
natural ability to transfer genes to plants (EP-A 0 270 355;
EP-A 0 116 718). Agrobacterium transformation is widespread for
transforming dicotyledons, but is also increasingly applied to
monocotyledons (Toriyama et al. (1988) Bio/Technology
6:1072-1074; Zhang et al. (1988) Plant Cell Rep 7:379-384; Zhang
et al. (1988) Theor Appl Genet 76:835-840; Shimamoto et al.
(1989) Nature 338:274-276; Datta et al. (1990) Bio/Technology
8:736-740; Christou et al. (1991) Bio/Technology 9:957-962; Peng
et al. (1991) International Rice Research Institute, Manila,
Philippines 563-574; Cao et al. (1992) Plant Cell Rep 11:585-591;
Li et al. (1993) Plant Cell Rep 12:250-255; Rathore et al. (1993)
Plant Mol Biol 21:871-884; Fromm et al. (1990) Bio/Technology
8:833-839; Gordon-Kamm et al. (1990) Plant Cell 2:603-618;
D'Halluin et al. (1992) Plant Cell 4:1495-1505; Walters et al.
(1992) Plant Mol Biol 18:189-200; Koziel et al. (1993)
Biotechnology 11:194-200; Vasil IK (1994) Plant Mol Biol
25:925-937; Weeks et al. (1993) Plant Physiol 102:1077-1084;
Somers et al. (1992) Bio/Technology 10:1589-1594; WO 92/14828;
Hiei et al. (1994) Plant J 6:271-282).


The strains most often used for agrobacterial transformation,
Agrobacterium tumefaciens or Agrobacterium rhizogenes, contain a
plasmid (Ti and Ri plasmids, respectively), which is transferred
to the plant after agrobacterial infection. Part of this plasmid,
called T-DNA (transferred DNA), is integrated into the genome of
the plant cell. Alternatively, Agrobacterium may also transfer
binary vectors (mini Ti plasmids) to plants and integrate them
into the genome of said plants.


The application of Agrobacterium tumefaciens to the
transformation of plants, using tissue culture explants, has been
described (inter alia, Horsch RB et al. (1985) Science 227:1229ff;
Fraley et al. (1983) Proc Natl Acad Sci USA 80:4803-4807; Bevan
et al. (1983) Nature 304:184-187). Many Agrobacterium tumefaciens
strains are capable of transferring genetic material, such as,
for example, the strains EHA101[pEHA101], EHA105[pEHA105],
LBA4404[pAL4404], C58C1[pMP90] and C58C1[pGV2260] (Hood et al.
(1993) Transgenic Res 2:208-218; Hoekema et al. (1983) Nature
303:179-181; Koncz and Schell (1986) Mol Gen Genet 204:383-396;
Deblaere et al. (1985) Nucl Acids Res 13:4777-4788).



When using agrobacteria, the expression cassette must be
integrated into special plasmids, either a shuttle or
intermediate vector or a binary vector. When using a Ti or Ri
plasmid for transformation, at least the right border, but
usually the right and left borders of the Ti or Ri plasmid T-DNA
are connected as a flanking region to the expression cassette to
be introduced. Preference is given to using binary vectors.
Binary vectors may replicate both in E. coli and in agrobacteria
and contain the components required for transfer into a plant
system. They normally contain a selection marker gene for
selection of transformed plants (e.g. the nptII gene which
imparts a resistance to kanamycin) and a linker or polylinker
flanked by the right and left T-DNA border sequences. Moreover,
they also contain, outside the T-DNA border sequence, a selection
marker which enables transformed E. coli and/or agrobacteria to
be selected (e.g. the nptIII gene which imparts a resistance to
kanamycin). Corresponding vectors may be transformed directly
into Agrobacterium (Holsters et al. (1978) Mol Gen Genet
163:181-187).
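
The layout of a binary vector described above can be written down
as a simple data structure with a consistency check. This is only
a sketch under stated assumptions: the class, the function and
the element labels ("RB", "LB", "polylinker", "ori") are
hypothetical names chosen for illustration and do not describe
any particular vector.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BinaryVector:
    """Hypothetical, simplified description of a binary vector."""
    t_dna: List[str]     # ordered elements of the T-DNA, borders included
    backbone: List[str]  # elements located outside the T-DNA borders


def layout_problems(vector: BinaryVector,
                    plant_marker: str = "nptII",
                    bacterial_marker: str = "nptIII") -> List[str]:
    """Return a list of deviations from the layout described in the text."""
    problems = []
    # The T-DNA should be flanked by the right and left border sequences.
    ends = {vector.t_dna[0], vector.t_dna[-1]} if len(vector.t_dna) >= 2 else set()
    if ends != {"RB", "LB"}:
        problems.append("T-DNA is not flanked by right and left borders")
    # The marker for selecting transformed plants lies within the T-DNA.
    if plant_marker not in vector.t_dna:
        problems.append("plant selection marker is not inside the T-DNA")
    # The marker for selecting E. coli/agrobacteria lies outside the T-DNA.
    if bacterial_marker not in vector.backbone:
        problems.append("bacterial selection marker is not outside the T-DNA")
    return problems
```

A vector described as `BinaryVector(["RB", "nptII", "polylinker",
"LB"], ["nptIII", "ori"])` would pass all three checks in this
sketch.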



Binary vectors are based, for example, on "broad host range"
plasmids such as pRK252 (Bevan et al. (1984) Nucl Acid Res
12:8711-8720) and pTJS75 (Watson et al. (1985) EMBO J
4(2):277-284). A large group of the binary vectors used is
derived from pBIN19 (Bevan et al. (1984) Nucl Acid Res
12:8711-8720). Hajdukiewicz et al. developed a binary vector
(pPZP) which is smaller and more efficient than the previously
customary vectors (Hajdukiewicz et al. (1994) Plant Mol Biol
25:989-994). Improved and particularly preferred binary vector
systems for Agrobacterium-mediated transformation are described
in WO 02/00900.



The agrobacteria transformed with a vector of this kind may then
be used in the known manner for transforming plants, in
particular crop plants such as, for example, oilseed rape, for
example by bathing wounded leaves or leaf sections in an
agrobacterial solution and subsequently culturing them in
suitable media. The transformation of plants by agrobacteria has
been described (White FF, Vectors for Gene Transfer in Higher
Plants; in: Transgenic Plants, Vol. 1, Engineering and
Utilization, edited by S.D. Kung and R. Wu, Academic Press, 1993,
pp. 15-38; Jenes B et al. (1993) Techniques for Gene Transfer,
in: Transgenic Plants, Vol. 1, Engineering and Utilization,
edited by S.D. Kung and R. Wu, Academic Press, pp. 128-143;
Potrykus (1991) Annu Rev Plant Physiol Plant Molec Biol
42:205-225). Transgenic plants may be regenerated in the known
manner from the transformed cells of the wounded leaves or leaf
sections.



Different explants, cell cultures, tissues, organs, embryos,
seeds, microspores or other unicellular or multicellular cellular
structures derived from a plant organism may be used for
transformation. Transformation processes adjusted to the
particular explants, cultures or tissues are known to the skilled
worker. Examples which may be mentioned are: shoot internodes
(Fry J et al. (1987) Plant Cell Rep 6:321-325), hypocotyls (Radke
SE et al. (1988) Theor Appl Genet 75:685-694; Schroder M et al.
(1994) Physiologia Plant 92:37-46; Stefanov I et al. (1994) Plant
Sci 95:175-186; Weier et al. (1997) Fett/Lipid 99:160-165),
cotyledonous petioles (Moloney MM et al. (1989) Plant Cell Rep
8:238-242; Weier D et al. (1998) Molecular Breeding 4:39-46),
microspores and proembryos (Pechan (1989) Plant Cell Rep
8:387-390) and flower stalks (Boulter ME et al. (1990) Plant Sci
70:91-99; Guerche P et al. (1987) Mol Gen Genet 206:382-386). In
the case of a direct gene transfer, mesophyll protoplasts (Chapel
PJ & Glimelius K (1990) Plant Cell Rep 9:105-108; Golz et al.
(1990) Plant Mol Biol 15:475-483) or else hypocotyl protoplasts
(Bergmann P & Glimelius K (1993) Physiologia Plant 88:604-611)
and microspores (Chen et al. (1994) Theor Appl Genet 88:187-192;
Jones-Villeneuve E et al. (1995) Plant Cell Tissue and Organ Cult
40:97-100) and shoot sections (Seki M et al. (1991) Plant Mol
Biol 17:259-263) may be employed successfully.

Stably transformed cells, i.e. those which contain the introduced
DNA integrated into the DNA of the host cell, may be selected
from untransformed cells by using the selection process of the
invention. The plants obtained may be grown and crossed in the
usual way. Preferably, two or more generations should be cultured
in order to ensure that the genomic integration is stable and can
be inherited.

As soon as a transformed plant cell has been prepared, it is
possible to obtain a complete plant by using processes known to
the skilled worker. This involves, for example, starting from
callus cultures, individual cells (e.g. protoplasts) or leaf
disks (Vasil et al. (1984) Cell Culture and Somatic Cell Genetics
of Plants, Vol I, II and III, Laboratory Procedures and Their
Applications. Academic Press; Weissbach and Weissbach (1989)
Methods for Plant Molecular Biology, Academic Press). It is
possible to induce from these still undifferentiated callus cell
masses the formation of shoot and root in the known manner. The
seedlings obtained may be planted out and grown. Appropriate
processes have been described (Fennell et al. (1992) Plant Cell
Rep 11:567-570; Stoeger et al. (1995) Plant Cell Rep 14:273-278;
Jähne et al. (1994) Theor Appl Genet 89:525-533).

The efficacy of expressing the transgenically expressed nucleic
acids may be determined, for example, in vitro by shoot-meristem
propagation using any of the selection methods described above.
Moreover, changes in the type and level of expression of a target
gene and the effect on the phenotype of the plant may be tested
in greenhouse experiments using test plants.


The process of the invention is preferably used within the
framework of plant biotechnology for generating plants having
advantageous properties. The "nucleic acid sequence to be
inserted" into the genome of the plant cell or the plant organism
preferably comprises at least one expression cassette, said
expression cassette being able to express, under the control of a
promoter functional in plant cells or plant organisms, an RNA
and/or a protein which do not cause reduction of the expression,
amount, activity and/or function of a marker protein but,
particularly preferably, impart to the plant genetically altered
in this way an advantageous phenotype. Numerous genes and
proteins which may be used for achieving an advantageous
phenotype, for example for increasing the quality of foodstuffs
or for producing particular chemicals or pharmaceuticals (Dunwell
JM (2000) J Exp Bot 51 Spec No:487-96), are known to the skilled
worker.



Thus it is possible to improve the suitability of the plants or
the seeds thereof as foodstuff or feedstuff, for example by
altering the compositions and/or the content of metabolites, in
particular proteins, oils, vitamins and/or starch. It is also
possible to increase the growth rate, yield or resistance to
biotic or abiotic stress factors. Advantageous effects may be
achieved both by transgenic expression of nucleic acids or
proteins and by targeted reduction of the expression of
endogenous genes, with respect to the phenotype of the transgenic
plant. The advantageous effects which may be achieved in the
transgenic plant comprise, for example:



- increased resistance to pathogens (biotic stress)

- increased resistance to environmental factors such as heat,
  cold, frost, drought, UV light, oxidative stress, wetness,
  salt, etc. (abiotic stress)

- increased yield

- improved quality, for example increased nutritional value,
  increased storability

The invention further relates to the use of the transgenic plants
prepared according to the process of the invention and of the
cells, cell cultures, plants or propagation material such as
seeds or fruits derived from said plants, for preparing foodstuff
or feedstuff, pharmaceuticals or fine chemicals such as, for
example, enzymes, vitamins, amino acids, sugars, fatty acids,
natural and synthetic flavorings, aroma substances and colorants.
Particular preference is given to the production of
triacylglycerides, lipids, oils, fatty acids, starch, tocopherols
and tocotrienols and also carotenoids. Genetically modified
plants of the invention which may be consumed by humans and
animals may also be used as foodstuff or feedstuff, for example
directly or after preparation known per se.

As already mentioned above, the process of the invention
comprises in a particularly advantageous embodiment, in a process
step downstream of the selection, the deletion of the sequence
coding for the marker protein (e.g. mediated by recombinase or as
described in WO03/004659) or the elimination by crossing and/or
segregation of said sequences. (It is obvious to the skilled
worker that, for this purpose, the nucleic acid sequence
integrated into the genome and the sequence coding for the marker
protein should have a separate chromosomal locus in the
transformed cells. This, however, is the case in the majority of
the resulting plants, merely for reasons of statistics.) This
procedure is particularly advantageous if the marker protein is a
transgene which otherwise does not occur in the plant to be
transformed. Although the resulting plant may still possibly
contain the compound for reducing the expression, amount,
activity and/or function of the marker protein, said compound
would no longer have any "counterpart" in the form of said marker
protein, and thus would have no effect. This is particularly the
case if the marker protein is derived from a non-plant organism
and/or is synthetic (for example the codA protein). It is,
however, also possible to use plant marker proteins from other
plant species, which otherwise do not occur in the cell to be
transformed (i.e. if not introduced as transgene). Said marker
proteins are referred to as "nonendogenous" marker proteins
within the scope of the present invention.
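
The elimination of the marker sequence by crossing and
segregation can be illustrated with a short Mendelian
calculation. The sketch below is not part of the described
process; it assumes, purely for illustration, that the inserted
sequence and the marker sequence are each present as a single
hemizygous copy at unlinked loci, and the function name and the
two crossing schemes are hypothetical.

```python
from fractions import Fraction


def fraction_insert_without_marker(cross: str = "self") -> Fraction:
    """Expected fraction of progeny which carry the inserted sequence
    but no longer carry the marker sequence.

    Assumptions (illustrative only): one hemizygous copy of each
    sequence, unlinked loci, regular Mendelian segregation.
    """
    if cross == "self":
        # Selfing a hemizygote: 3/4 of the progeny carry a locus, 1/4 lack it.
        p_insert_present = Fraction(3, 4)
        p_marker_absent = Fraction(1, 4)
    elif cross == "backcross":
        # Crossing to a plant lacking both sequences: 1/2 carry, 1/2 lack.
        p_insert_present = Fraction(1, 2)
        p_marker_absent = Fraction(1, 2)
    else:
        raise ValueError("cross must be 'self' or 'backcross'")
    # Unlinked loci segregate independently, so the probabilities multiply.
    return p_insert_present * p_marker_absent
```

Under these assumptions, 3/16 of the selfed progeny and 1/4 of
the backcross progeny would be expected to retain the inserted
sequence while having lost the marker sequence. Linked loci or
multiple copies would lower these fractions.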



Very particularly advantageously, the compound for reducing the
expression, amount, activity and/or function of the marker
protein is an RNA. After deletion or elimination by
crossing/segregation, the resulting transgenic plant would no
longer contain any unnecessary (and, if appropriate, undesired)
foreign protein. The sole foreign protein would possibly be the
protein resulting from the nucleic acid sequence inserted into
the genome. For reasons of product approval, this embodiment is
particularly advantageous. As described above, said RNA may be an
antisense RNA or, particularly preferably, a double-stranded RNA.
It may be expressed separately from the RNA coding for the target
protein but also, possibly, on the same strand as the latter.




In summary, the particularly advantageous embodiment comprises 
the following features: 

A process for preparing transformed plant cells or organisms, 
which comprises the following steps: 



a) transforming a population of plant cells which comprises at
   least one non-endogenous (preferably non-plant) marker protein
   capable of converting directly or indirectly a substance
   X which is nontoxic for said population of plant cells into a
   substance Y which is toxic for said population, with at least
   one nucleic acid sequence to be inserted in combination with
   at least one nucleic acid sequence coding for a ribonucleic
   acid sequence capable of reducing the expression, amount,
   activity and/or function of said marker protein, and

b) treating said population of plant cells with the substance X
   at a concentration which causes a toxic effect for
   nontransformed cells, due to the conversion by the marker
   protein, and

c) selecting transformed plant cells (and/or populations of
   plant cells, such as plant tissues or plants) whose genome
   contains said nucleic acid sequence and which have a growth
   advantage over nontransformed cells, due to the action of
   said compound, from said population of plant cells, the
   selection being carried out under conditions under which the
   marker protein can exert its toxic effect on the
   nontransformed cells, and

d) regenerating fertile plants, and

e) eliminating by crossing the nucleic acid sequence coding for
   the marker protein and isolating fertile plants whose genome
   contains said nucleic acid sequence but no longer contains
   the sequence coding for the marker protein.
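
The selection principle of steps a) to c) can be illustrated with a deliberately simplified sketch. It is not part of the described process: the class `Cell`, the functions `toxic_load` and `survives`, and all numerical values (knockdown fraction, concentration of X, tolerance threshold) are hypothetical and chosen only to show why a cell carrying the marker-suppressing RNA gains a growth advantage when substance X is applied.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Toy model of one plant cell of the starting population."""
    marker_activity: float   # relative marker protein activity (1.0 = unsuppressed)
    knockdown: float = 0.0   # fraction of marker activity removed by the RNA
                             # (0.0 = nontransformed cell; hypothetical scale)


def toxic_load(cell: Cell, x_conc: float) -> float:
    """Amount of toxic substance Y formed from nontoxic substance X.

    Assumption: Y is simply proportional to the remaining marker
    activity and to the concentration of X.
    """
    return cell.marker_activity * (1.0 - cell.knockdown) * x_conc


def survives(cell: Cell, x_conc: float, tolerance: float) -> bool:
    """A cell survives if its load of Y stays below a tolerance threshold."""
    return toxic_load(cell, x_conc) < tolerance


# Hypothetical values for illustration only.
nontransformed = Cell(marker_activity=1.0)
transformed = Cell(marker_activity=1.0, knockdown=0.9)

for name, cell in [("nontransformed", nontransformed), ("transformed", transformed)]:
    print(name, "survives:", survives(cell, x_conc=1.0, tolerance=0.5))
```

In this toy model the nontransformed cell converts all of X into Y and is eliminated, whereas the transformed cell, whose marker activity is reduced, remains below the threshold. Real selection depends on many factors the sketch ignores.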


Sequences 



SEQ ID NO: 1   Nucleic acid sequence coding for E. coli cytosine
               deaminase (codA)

SEQ ID NO: 2   Amino acid sequence coding for E. coli cytosine
               deaminase (codA)

SEQ ID NO: 3   Nucleic acid sequence coding for E. coli cytosine
               deaminase (codA), with modified start codon
               (GTG/ATG) for expression in eukaryotes

SEQ ID NO: 4   Amino acid sequence coding for E. coli cytosine
               deaminase (codA), with modified start codon
               (GTG/ATG) for expression in eukaryotes

SEQ ID NO: 5   Nucleic acid sequence coding for Streptomyces
               griseolus cytochrome P450-SU1 (suaC)

SEQ ID NO: 6   Amino acid sequence coding for Streptomyces
               griseolus cytochrome P450-SU1 (suaC)

SEQ ID NO: 7   Nucleic acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (tms2)

SEQ ID NO: 8   Amino acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (tms2)

SEQ ID NO: 9   Nucleic acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (tms2)

SEQ ID NO: 10  Amino acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (tms2)

SEQ ID NO: 11  Nucleic acid sequence coding for Xanthobacter
               autotrophicus haloalkane dehalogenase (dhlA)

SEQ ID NO: 12  Amino acid sequence coding for Xanthobacter
               autotrophicus haloalkane dehalogenase (dhlA)

SEQ ID NO: 13  Nucleic acid sequence coding for Herpes simplex
               Virus 1 thymidine kinase

SEQ ID NO: 14  Amino acid sequence coding for Herpes simplex
               Virus 1 thymidine kinase

SEQ ID NO: 15  Nucleic acid sequence coding for Herpes simplex
               Virus 1 thymidine kinase

SEQ ID NO: 16  Amino acid sequence coding for Herpes simplex
               Virus 1 thymidine kinase

SEQ ID NO: 17  Nucleic acid sequence coding for Toxoplasma gondii
               hypoxanthine-xanthine-guanine phosphoribosyl
               transferase

SEQ ID NO: 18  Amino acid sequence coding for Toxoplasma gondii
               hypoxanthine-xanthine-guanine phosphoribosyl
               transferase

SEQ ID NO: 19  Nucleic acid sequence coding for E. coli
               xanthine-guanine phosphoribosyl transferase

SEQ ID NO: 20  Amino acid sequence coding for E. coli
               xanthine-guanine phosphoribosyl transferase

SEQ ID NO: 21  Nucleic acid sequence coding for E. coli
               xanthine-guanine phosphoribosyl transferase

SEQ ID NO: 22  Amino acid sequence coding for E. coli
               xanthine-guanine phosphoribosyl transferase

SEQ ID NO: 23  Nucleic acid sequence coding for E. coli purine
               nucleoside phosphorylase (deoD)

SEQ ID NO: 24  Nucleic acid sequence coding for E. coli purine
               nucleoside phosphorylase (deoD)

SEQ ID NO: 25  Nucleic acid sequence coding for Burkholderia
               caryophylli phosphonate monoester hydrolase (pehA)

SEQ ID NO: 26  Amino acid sequence coding for Burkholderia
               caryophylli phosphonate monoester hydrolase (pehA)

SEQ ID NO: 27  Nucleic acid sequence coding for Agrobacterium
               rhizogenes tryptophan oxygenase (aux1)

SEQ ID NO: 28  Amino acid sequence coding for Agrobacterium
               rhizogenes tryptophan oxygenase (aux1)

SEQ ID NO: 29  Nucleic acid sequence coding for Agrobacterium
               rhizogenes indoleacetamide hydrolase (aux2)

SEQ ID NO: 30  Amino acid sequence coding for Agrobacterium
               rhizogenes indoleacetamide hydrolase (aux2)

SEQ ID NO: 31  Nucleic acid sequence coding for Agrobacterium
               tumefaciens tryptophan oxygenase (aux1)

SEQ ID NO: 32  Amino acid sequence coding for Agrobacterium
               tumefaciens tryptophan oxygenase (aux1)

SEQ ID NO: 33  Nucleic acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (aux2)

SEQ ID NO: 34  Amino acid sequence coding for Agrobacterium
               tumefaciens indoleacetamide hydrolase (aux2)

SEQ ID NO: 35  Nucleic acid sequence coding for Agrobacterium
               vitis indoleacetamide hydrolase (aux2)

SEQ ID NO: 36  Amino acid sequence coding for Agrobacterium vitis
               indoleacetamide hydrolase (aux2)

SEQ ID NO: 37  Nucleic acid sequence coding for Arabidopsis
               thaliana 5-methylthioribose kinase (mtrK)

SEQ ID NO: 38  Amino acid sequence coding for Arabidopsis
               thaliana 5-methylthioribose kinase (mtrK)

SEQ ID NO: 39  Nucleic acid sequence coding for Klebsiella
               pneumoniae 5-methylthioribose kinase (mtrK)

SEQ ID NO: 40  Amino acid sequence coding for Klebsiella
               pneumoniae 5-methylthioribose kinase (mtrK)

SEQ ID NO: 41  Nucleic acid sequence coding for Arabidopsis
               thaliana alcohol dehydrogenase (adh)

SEQ ID NO: 42  Amino acid sequence coding for Arabidopsis
               thaliana alcohol dehydrogenase (adh)

SEQ ID NO: 43  Nucleic acid sequence coding for Hordeum vulgare
               (barley) alcohol dehydrogenase (adh)

SEQ ID NO: 44  Amino acid sequence coding for Hordeum vulgare
               (barley) alcohol dehydrogenase (adh)

SEQ ID NO: 45  Nucleic acid sequence coding for Oryza sativa
               (rice) alcohol dehydrogenase (adh)

SEQ ID NO: 46  Amino acid sequence coding for Oryza sativa (rice)
               alcohol dehydrogenase (adh)

SEQ ID NO: 47  Nucleic acid sequence coding for Zea mays (corn)
               alcohol dehydrogenase (adh)

SEQ ID NO: 48  Amino acid sequence coding for Zea mays (corn)
               alcohol dehydrogenase (adh)

SEQ ID NO: 49  Nucleic acid sequence coding for a sense RNA
               fragment of E. coli cytosine deaminase
               (codARNAi-sense)

SEQ ID NO: 50  Oligonucleotide primer codA5'HindIII
               5'-AAGCTTGGCTAACAGTGTCGAATAACG-3'

SEQ ID NO: 51  Oligonucleotide primer codA3'SalI
               5'-GTCGACGACAAAATCCCTTCCTGAGG-3'

SEQ ID NO: 52  Nucleic acid sequence coding for an antisense RNA
               fragment of E. coli cytosine deaminase
               (codARNAi-anti)

SEQ ID NO: 53  Oligonucleotide primer codA5'EcoRI
               5'-GAATTCGGCTAACAGTGTCGAATAACG-3'

SEQ ID NO: 54  Oligonucleotide primer codA3'BamHI
               5'-GGATCCGACAAAATCCCTTCCTGAGG-3'

SEQ ID NO: 55  Vector construct pBluKS-nitP-STLS1-35S-T

SEQ ID NO: 56  Expression vector pSUN-1

SEQ ID NO: 57  Transgenic expression vector pSUN-1-codA-RNAi

SEQ ID NO: 58  Transgenic expression vector
               pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT

SEQ ID NO: 59  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from corn (Zea mays), fragment

SEQ ID NO: 60  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from corn (Zea mays), fragment

SEQ ID NO: 61  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from oilseed rape (Brassica napus),
               fragment

SEQ ID NO: 62  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from oilseed rape (Brassica napus),
               fragment

SEQ ID NO: 63  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from oilseed rape (Brassica napus),
               fragment

SEQ ID NO: 64  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from oilseed rape (Brassica napus),
               fragment

SEQ ID NO: 65  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from rice (Oryza sativa), fragment

SEQ ID NO: 66  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from rice (Oryza sativa), fragment

SEQ ID NO: 67  Nucleic acid sequence coding for 5-methylthioribose
               kinase (mtrK) from soybean (Glycine max), fragment

SEQ ID NO: 68  Amino acid sequence coding for 5-methylthioribose
               kinase (mtrK) from soybean (Glycine max), fragment

SEQ ID NO: 69  Oligonucleotide primer codA5'C-term
               5'-CGTGAATACGGCGTGGAGTCG-3'

SEQ ID NO: 70  Oligonucleotide primer codA3'C-term
               5'-CGGCAGGATAATCAGGTTGG-3'

SEQ ID NO: 71  Oligonucleotide primer 35sT 5' primer
               5'-GTCAACGTAACCAACCCTGC-3'
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Figures 



Fig. 1: Inactivation of the marker protein gene by means of
introducing a recombinase

P:     Promoter
MP:    Sequence coding for a marker protein
R1/R2: Recombinase recognition sequences
R:     Recombinase or sequence coding for recombinase

In a preferred embodiment, the marker protein gene is
inactivated by introducing a sequence-specific recombinase.
The recombinase is preferably expressed, as depicted here,
starting from an expression cassette.

The marker protein gene is flanked by recognition sequences
for sequence-specific recombinases, with sequences of said
marker protein gene being deleted by introducing said
recombinase and thus said marker protein gene being
inactivated.



Fig. 2-A: Inactivation of the marker protein gene by the action of
a sequence-specific nuclease

P:         Promoter
DS:        Recognition sequence for targeted induction of DNA
           double-strand breaks
MP-DS-MP': Sequence coding for a marker protein, comprising a DS
nDS:       Inactivated DS
E:         Sequence-specific enzyme for targeted induction of DNA
           double-strand breaks

The marker protein gene may be inactivated by a targeted
mutation or deletion in the marker protein gene, for example
by sequence-specific induction of DNA double-strand breaks at
a recognition sequence for targeted induction of DNA
double-strand breaks in or close to the marker protein gene
(P-MP). The double-strand break may occur in the coding region
or else the noncoding (such as, for example, the promoter)
region and induces an illegitimate recombination
(nonhomologous DNA end joining) and thus, for example, a shift
in the reading frame of said marker protein.

Fig. 2-B: Inactivation of the marker protein gene by the action of
a sequence-specific nuclease

P:   Promoter
DS:  Recognition sequence for targeted induction of DNA
     double-strand breaks
MP:  Sequence coding for a marker protein
nDS: Inactivated DS
E:   Sequence-specific enzyme for targeted induction of DNA
     double-strand breaks

The marker protein gene may be inactivated by a targeted
deletion by sequence-specific induction of more than one
sequence-specific DNA double-strand break in or close to said
marker protein gene. The double-strand breaks may occur in the
coding region or else the noncoding (such as, for example, the
promoter) region and induce a deletion in the marker protein
gene. The marker protein gene is preferably flanked by DS
sequences and is completely deleted by the action of enzyme E.

Fig. 3: Inactivation of the marker protein gene by inducing an
intramolecular homologous recombination, due to the action of a
sequence-specific nuclease

A/A': Sequences with a sufficient length and homology to one
      another, in order to recombine with one another as a
      consequence of the induced double-strand break
P:    Promoter
DS:   Recognition sequence for targeted induction of DNA
      double-strand breaks
MP:   Sequence coding for a marker protein
E:    Sequence-specific enzyme for targeted induction of DNA
      double-strand breaks

The marker protein gene may be inactivated by a deletion by
means of intramolecular homologous recombination. Said
homologous recombination may be initiated by sequence-specific
induction of DNA double-strand breaks at a recognition
sequence for targeted induction of DNA double-strand breaks in
or close to the marker protein gene. The homologous
recombination occurs between the sequences A and A' which have
a sufficient length and homology to one another in order to
recombine with one another as a consequence of the induced
double-strand break. The recombination causes a deletion of
essential sequences of the marker protein gene.
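
The outcome of the intramolecular recombination described for Fig. 3 can be sketched as a string operation. This is only a toy illustration: the function name `recombine_direct_repeats`, the placeholder locus and the assumption that A and A' are identical direct repeats flanking the sequence to be deleted are all hypothetical simplifications and are not taken from the figure.

```python
def recombine_direct_repeats(locus: str, repeat: str) -> str:
    """Collapse two direct repeats into one copy, deleting what lies between.

    Models the net result of homologous recombination between A and A'
    (here assumed to be identical): one repeat copy remains, the
    intervening sequence is lost.
    """
    first = locus.find(repeat)
    second = locus.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("locus does not contain two copies of the repeat")
    return locus[: first + len(repeat)] + locus[second + len(repeat):]


# Hypothetical placeholder locus: flank, A, promoter/marker/DS, A', flank.
A = "GATTACAGATTACA"
locus = "ttt" + A + "<P-MP-DS>" + A + "ccc"

print(recombine_direct_repeats(locus, A))
```

The marker placeholder between the two repeats is absent from the result, which corresponds to the deletion of essential sequences of the marker protein gene described above.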



Fig. 4: Inactivation of the marker protein gene by intermolecular
homologous recombination

A/A': Sequences with a sufficient length and homology to one
      another in order to recombine with one another
B/B': Sequences with a sufficient length and homology to one
      another in order to recombine with one another
P:    Promoter
I:    Nucleic acid sequence/gene of interest to be inserted
MP:   Sequence coding for a marker protein

The marker protein gene (P-MP) may also be inactivated by a
targeted insertion into the marker protein gene, for example by
means of intermolecular homologous recombination. In this
context, the region to be inserted is flanked on its 5' and 3'
ends by nucleic acid sequences (A' and B', respectively), which
have a sufficient length and homology to corresponding flanking
sequences of the marker protein gene (A and B, respectively) in
order to make possible a homologous recombination between A and
A' and B and B'. The recombination causes a deletion of
essential sequences of the marker protein gene.

Fig. 5: Inactivation of the marker protein gene by intermolecular
homologous recombination due to the action of a
sequence-specific nuclease

A/A': Sequences with a sufficient length and homology to one
      another in order to recombine with one another
B/B': Sequences with a sufficient length and homology to one
      another in order to recombine with one another
P:    Promoter
I:    Nucleic acid sequence/gene of interest to be inserted
MP:   Sequence coding for a marker protein
DS:   Recognition sequence for targeted induction of DNA
      double-strand breaks
E:    Sequence-specific enzyme for targeted induction of DNA
      double-strand breaks

The marker protein gene may also be inactivated by a targeted
insertion into the marker protein gene, for example by means of
intermolecular homologous recombination. The homologous
recombination may be initiated by sequence-specific induction of
DNA double-strand breaks at a recognition sequence for targeted
induction of DNA double-strand breaks in or close to the marker
protein gene. In this context, the region to be inserted is
flanked at its 5' and 3' ends by nucleic acid sequences (A' and
B', respectively) which have a sufficient length and homology to
corresponding flanking sequences of the marker protein gene (A
and B, respectively) in order to make possible a homologous
recombination between A and A' and B and B'. The recombination
causes a deletion of essential sequences of the marker protein
gene.

Fig. 6: Vector map for pBluKS-nitP-STLS1-35S-T (SEQ ID NO: 55)

NitP: promoter of the A. thaliana nitrilase1 gene (GenBank
Acc. No.: Y07648.2, Hillebrand et al. (1996) Gene 170:197-200)

STLS-1 intron: intron of the potato ST-LS1 gene (Vancanneyt GF
et al. (1990) Mol Gen Genet 220(2):245-250).

35S-Term: Terminator of the 35S CaMV gene (cauliflower mosaic
virus; Franck et al. (1980) Cell 21:285-294).

Cleavage sites of relevant restriction endonucleases are
indicated with their particular cleavage position.

Fig. 7: Vector map for the transgenic expression vector
pSUN-1-codA-RNAi (SEQ ID NO: 57)

NitP: promoter of the A. thaliana nitrilase1 gene (GenBank
Acc. No.: Y07648.2, Hillebrand et al. (1996) Gene 170:197-200)

STLS-1 intron: intron of the potato ST-LS1 gene (Vancanneyt GF
et al. (1990) Mol Gen Genet 220(2):245-250).

35S-Term: Terminator of the 35S CaMV gene (cauliflower mosaic
virus; Franck et al. (1980) Cell 21:285-294).

codA-sense: Nucleic acid sequence coding for a sense RNA
fragment of E. coli cytosine deaminase (codARNAi-sense;
SEQ ID NO: 49)

codA-anti: Nucleic acid sequence coding for an antisense RNA
fragment of E. coli cytosine deaminase (codARNAi-anti;
SEQ ID NO: 52)

LB/RB: Left and, respectively, right boundaries of
Agrobacterium T-DNA

Cleavage sites of relevant restriction endonucleases are
indicated with their particular cleavage position. Further
elements represent customary elements of a binary
Agrobacterium vector (aadA; ColE1; repA)

Fig. 8: Vector map for the transgenic expression vector
pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT (SEQ ID NO: 58)

NitP: promoter of the A. thaliana nitrilase1 gene (GenBank
Acc. No.: Y07648.2, Hillebrand et al. (1996) Gene 170:197-200)

STLS-1 intron: intron of the potato ST-LS1 gene (Vancanneyt GF
et al. (1990) Mol Gen Genet 220(2):245-250).

35S-Term: Terminator of the 35S CaMV gene (cauliflower mosaic
virus; Franck et al. (1980) Cell 21:285-294).

codA-sense: Nucleic acid sequence coding for a sense RNA
fragment of E. coli cytosine deaminase (codARNAi-sense;
SEQ ID NO: 49)

codA-anti: Nucleic acid sequence coding for an antisense RNA
fragment of E. coli cytosine deaminase (codARNAi-anti;
SEQ ID NO: 52)

Left border/right border: Left and, respectively, right
boundaries of Agrobacterium T-DNA

Cleavage sites of relevant restriction endonucleases are
indicated with their particular cleavage position. Further
elements represent customary elements of a binary
Agrobacterium vector (aadA; ColE1; repA)

Fig. 9a-b: Sequence comparison of 5-methylthioribose (MTR)
kinases from various organisms, in particular plant organisms.
Sequences from Klebsiella pneumoniae, Clostridium tetani,
Arabidopsis thaliana (A.thaliana), oilseed rape (Brassica
napus), soybean (Soy-1), rice (Oryza sativa-1) and also the
consensus sequence (Consensus) are shown. Homologous regions
can be readily deduced from the consensus sequence.

Exemplary embodiments

General methods

The chemical synthesis of oligonucleotides may be carried out,
for example, in the known manner by using the phosphoamidite method
(Voet, Voet, 2nd Edition, Wiley Press New York, pages 896-897).
The cloning steps carried out within the scope of the present
invention, such as, for example, restriction cleavages, agarose gel
electrophoresis, purification of DNA fragments, transfer of
nucleic acids to nitrocellulose and nylon membranes, linking of DNA
fragments, transformation of E. coli cells, cultivation of
bacteria, propagation of phages and sequence analysis of
recombinant DNA, are carried out as described in Sambrook et al. (1989)
Cold Spring Harbor Laboratory Press; ISBN 0-87969-309-6. The
sequencing of recombinant DNA molecules was carried out using a
laser fluorescence DNA sequencer from ABI, according to the method
of Sanger (Sanger et al. (1977) Proc Natl Acad Sci
USA 74:5463-5467).

Example 1: Preparation of codA fragments

First, a truncated nucleic acid variant of the codA gene,
modified by the addition of recognition sequences of the restriction
enzymes HindIII and SalI, is prepared using the PCR technique.
For this purpose, part of the codA gene (GenBank Acc. No.:
S56903; SEQ ID NO: 1) is amplified from the E. coli source
organism by means of the polymerase chain reaction (PCR) using a
sense-specific primer (codA5'HindIII; SEQ ID NO: 50) and an
antisense-specific primer (codA3'SalI; SEQ ID NO: 51).

codA5'HindIII: 5'-AAGCTTGGCTAACAGTGTCGAATAACG-3' (SEQ ID NO: 50)

codA3'SalI: 5'-GTCGACGACAAAATCCCTTCCTGAGG-3' (SEQ ID NO: 51)
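
The primer design can be checked with a short sketch. It uses the primer sequences as given in the sequence listing (SEQ ID NO: 50, 51, 53 and 54) and the well-known recognition sequences of the four restriction enzymes. The function name `split_primer` and the data layout are illustrative assumptions only.

```python
# Recognition sequences of the restriction enzymes named in Example 1.
SITES = {
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
}

# Primer name -> (enzyme whose site was added, primer sequence 5'->3'),
# sequences as listed under SEQ ID NO: 50, 51, 53 and 54.
PRIMERS = {
    "codA5'HindIII": ("HindIII", "AAGCTTGGCTAACAGTGTCGAATAACG"),
    "codA3'SalI": ("SalI", "GTCGACGACAAAATCCCTTCCTGAGG"),
    "codA5'EcoRI": ("EcoRI", "GAATTCGGCTAACAGTGTCGAATAACG"),
    "codA3'BamHI": ("BamHI", "GGATCCGACAAAATCCCTTCCTGAGG"),
}


def split_primer(name: str) -> tuple[str, str]:
    """Split a primer into its 5' restriction site and the remaining part."""
    enzyme, seq = PRIMERS[name]
    site = SITES[enzyme]
    if not seq.startswith(site):
        raise ValueError(f"{name} does not start with the {enzyme} site")
    return site, seq[len(site):]


for name in PRIMERS:
    site, rest = split_primer(name)
    print(f"{name}: site={site} gene-specific part={rest}")
```

Under these assumptions, the two 5' primers share the same gene-specific part, as do the two 3' primers, so both primer pairs amplify the same codA region and differ only in the added restriction sites.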

The PCR was carried out in a 50 µl reaction mixture which
contained:

- 2 µl (200 ng) of E. coli genomic DNA
- 0.2 mM dATP, dTTP, dGTP, dCTP
- 1.5 mM Mg(OAc)2
- 5 µg of bovine serum albumin
- 40 pmol of "codA5'HindIII" primer
- 40 pmol of "codA3'SalI" primer
- 15 µl of 3.3x rTth DNA Polymerase XL buffer (PE Applied
  Biosystems)
- 5 U of rTth DNA Polymerase XL (PE Applied Biosystems)
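
The amounts in the list above can be converted into final concentrations for the 50 µl reaction. The following lines are a small arithmetic sketch. The function names are illustrative, and the only inputs are the quantities stated in the list.

```python
REACTION_VOLUME_UL = 50.0  # total reaction volume stated in the example


def primer_conc_um(pmol: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Final primer concentration in µM (pmol per µl equals µmol per l)."""
    return pmol / volume_ul


def buffer_strength_x(stock_x: float, stock_ul: float,
                      volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Final buffer strength as a multiple of the 1x working strength."""
    return stock_x * stock_ul / volume_ul


def mass_conc_per_ul(mass: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Final mass concentration (unit of `mass` per µl)."""
    return mass / volume_ul


print("primer, µM:", primer_conc_um(40))
print("buffer, x:", round(buffer_strength_x(3.3, 15), 2))
print("template DNA, ng/µl:", mass_conc_per_ul(200))
print("BSA, µg/µl:", mass_conc_per_ul(5))
```

Each primer is thus present at 0.8 µM, and 15 µl of the 3.3x stock gives approximately 1x buffer in the final volume.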

The PCR is carried out under the following cycle conditions:

Step 1: 5 minutes 94°C (denaturation)
Step 2: 3 seconds 94°C
Step 3: 1 minute 60°C (annealing)
Step 4: 2 minutes 72°C (elongation)

30 repeats of steps 2 to 4

Step 5: 10 minutes 72°C (post elongation)
Step 6: 4°C (waiting loop)
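
For planning purposes, the programmed hold time of this cycling profile can be added up as sketched below. The calculation assumes that steps 2 to 4 are run 30 times in total and ignores the temperature ramping between steps, which depends on the instrument. The names used are illustrative.

```python
# Hold times in seconds, taken from the cycle conditions above.
INITIAL_DENATURATION_S = 5 * 60
CYCLE_STEPS_S = {"denaturation": 3, "annealing": 60, "elongation": 2 * 60}
FINAL_ELONGATION_S = 10 * 60
CYCLES = 30  # assumption: steps 2 to 4 are run 30 times in total


def total_hold_time_s(cycles: int = CYCLES) -> int:
    """Sum of all programmed hold times, excluding ramping and the 4°C hold."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + cycles * per_cycle + FINAL_ELONGATION_S


seconds = total_hold_time_s()
print(f"{seconds} s = {seconds / 60:.1f} min")
```

Under these assumptions the programme holds for 6390 seconds, i.e. roughly one and three quarter hours before the final 4°C step.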

The amplicon (codARNAi-sense; SEQ ID NO: 49) is cloned using
standard methods into the PCR cloning vector pGEM-T (Promega).
The identity of the amplicon generated is confirmed by sequencing
using the M13F (-40) primer.

Another truncated fragment of the codA gene, modified by the
addition of recognition sequences of the restriction enzymes
EcoRI and BamHI, is amplified using a sense-specific primer
(codA5'EcoRI; SEQ ID NO: 53) and an antisense-specific primer
(codA3'BamHI; SEQ ID NO: 54).

codA5'EcoRI: 5'-GAATTCGGCTAACAGTGTCGAATAACG-3' (SEQ ID NO: 53)

codA3'BamHI: 5'-GGATCCGACAAAATCCCTTCCTGAGG-3' (SEQ ID NO: 54)

The PCR was carried out in a 50 µl reaction mixture which
contained:

- 2 µl (200 ng) of E. coli genomic DNA
- 0.2 mM dATP, dTTP, dGTP, dCTP
- 1.5 mM Mg(OAc)2
- 5 µg of bovine serum albumin
- 40 pmol of "codA5'EcoRI" primer
- 40 pmol of "codA3'BamHI" primer
- 15 µl of 3.3x rTth DNA Polymerase XL buffer (PE Applied
  Biosystems)
- 5 U of rTth DNA Polymerase XL (PE Applied Biosystems)

The PCR is carried out under the following cycle conditions:

Step 1: 5 minutes 94°C (denaturation)
Step 2: 3 seconds 94°C
Step 3: 1 minute 60°C (annealing)
Step 4: 2 minutes 72°C (elongation)

30 repeats of steps 2 to 4

Step 5: 10 minutes 72°C (post elongation)
Step 6: 4°C (waiting loop)

The amplicon (codARNAi-anti; SEQ ID NO: 52) is cloned using
standard methods into the PCR cloning vector pGEM-T (Promega). The
identity of the amplicon generated is confirmed by sequencing
using the M13F (-40) primer.

Example 2: Preparation of the transgenic expression vector for
expressing a codA double-stranded RNA

The codA fragments generated in example 1 are used for preparing
a DNA construct suitable for expressing a double-stranded codA
RNA (pSUN-codA-RNAi). The construct is suitable for reducing the
steady-state RNA level of the codA gene in transgenic plants and,
as a result, suppressing codA gene expression by using
the double-strand RNA interference (dsRNAi) technique. For this
purpose, the codA RNAi cassette is first constructed in the
plasmid pBluKS-nitP-STLS1-35S-T and then, in a further cloning step,
completely transferred to the pSUN-1 plasmid.

The vector pBluKS-nitP-STLS1-35S-T (SEQ ID NO: 55) is a
derivative of pBluescript KS (Stratagene) and contains the promoter of
the A. thaliana nitrilase1 gene (GenBank Acc. No.: Y07648.2,
nucleotides 2456 to 4340, Hillebrand et al. (1996) Gene
170:197-200), the STLS-1 intron (Vancanneyt GF et al. (1990) Mol
Gen Genet 220(2):245-250), restriction cleavage sites flanking
the intron on its 5' and 3' sides and enabling DNA fragments to
be inserted in a directed manner, and the terminator of the 35S
CaMV gene (cauliflower mosaic virus; Franck et al. (1980) Cell
21:285-294). Using these restriction cleavage sites (HindIII,
SalI, EcoRI, BamHI), the fragments codARNAi-sense (SEQ ID NO: 49)
and codARNAi-anti (SEQ ID NO: 52) are inserted into said vector,
thereby producing the finished codA RNAi cassette.



For this purpose, the codA sense fragment (codARNAi-sense; SEQ ID
NO: 49) is first excised from the pGEM-T vector, using the
enzymes HindIII and SalI, isolated and ligated into the
pBluKS-nitP-STLS1-35S-T vector under standard conditions. This vector
had previously been cleaved using the restriction enzymes HindIII
and SalI. Correspondingly positive clones are identified by
analytical restriction digest and sequencing.

The vector obtained (pBluKS-nitP-codAsense-STLS1-35S-T) is
digested using the restriction enzymes BamHI and EcoRI. The
codA-anti fragment (codARNAi-anti; SEQ ID NO: 52) is excised from the
corresponding pGEM-T vector, using BamHI and EcoRI, isolated and
ligated into the cut vector under standard conditions.
Correspondingly positive clones which contain the complete codA-RNAi
cassette (pBluKS-nitP-codAsense-STLS1-codAanti-35S-T) are
identified by analytical restriction digest and sequencing.

The codA-RNAi cassette is transferred into the pSUN-1 vector
(SEQ ID NO: 56) by using the SacI and KpnI restriction cleavage
sites flanking the cassette. The resulting vector pSUN1-codA-RNAi
(see Fig. 7; SEQ ID NO: 57) is used for transforming transgenic
A. thaliana plants which express an active codA gene (see below).
The plant expression vector pSUN-1 is particularly suitable
within the scope of the process of the invention, since it does not
contain any other positive selection marker.



The resulting vector, pSUN1-codA-RNAi, enables the constitutive
expression of an artificial codA-dsRNA variant consisting of two
identical nucleic acid elements which are separated by an intron
and inverted relative to one another. Transcription of this
artificial codA-dsRNA variant results in the formation of a
double-stranded RNA molecule, owing to the complementarity of the
inverted nucleic acid elements. The presence of this molecule
induces the suppression of codA gene expression (accumulation of
RNA) by means of double-strand RNA interference.
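
Why two inverted copies of the same element give a self-complementary transcript can be shown with a minimal sketch. The sequences used below are short hypothetical placeholders, not the actual codA fragment or STLS-1 intron, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return dna.translate(COMPLEMENT)[::-1]


def build_inverted_repeat(arm: str, spacer: str) -> str:
    """Arrange an element and its inverted copy around a spacer (intron)."""
    return arm + spacer + reverse_complement(arm)


def arms_can_pair(cassette: str, arm_len: int) -> bool:
    """True if the 3' arm is the reverse complement of the 5' arm,
    i.e. the transcript can fold back into a double-stranded stem."""
    five_prime_arm = cassette[:arm_len]
    three_prime_arm = cassette[-arm_len:]
    return reverse_complement(three_prime_arm) == five_prime_arm


# Hypothetical placeholder sequences for illustration only.
arm = "GGCTAACAGTG"
spacer = "GTAAGTTTTTCAG"
cassette = build_inverted_repeat(arm, spacer)

print(cassette)
print("arms complementary:", arms_can_pair(cassette, len(arm)))
```

The spacer stays unpaired in this picture. In the construct described above it is an intron, which would be expected to be removed by splicing in the plant cell.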



Example 4: Preparation of transgenic Arabidopsis thaliana plants

Transgenic Arabidopsis thaliana plants which transgenically
express the E. coli codA gene as a marker protein
("A. thaliana-[codA]") were prepared as described (Kirik et al.
(2000) EMBO J 19(20):5562-6).



The A. thaliana-[codA] plants are transformed with an
Agrobacterium tumefaciens strain (GV3101 [pMP90]) on the basis of a
modified vacuum infiltration method (Clough S & Bent A (1998) Plant J
16(6):735-43; Bechtold N et al. (1993) CR Acad Sci Paris
1144(2):204-212). The Agrobacterium tumefaciens cells used have
previously been transformed with the DNA construct described
(pSUN1-codA-RNAi). In this way, double transgenic
A. thaliana-[codA] plants are generated which express an artificial
codA double-stranded RNA under the control of the constitutive
nitrilase1 promoter. Expression of the codA gene is suppressed as
a consequence of the dsRNAi effect induced by the presence of
this artificial codA-dsRNA. Said double transgenic plants may be
identified owing to their regained ability to grow in the
presence of 5-fluorocytosine in the culture medium.



Seeds of primary transformants are selected on the basis of the
regained ability to grow in the presence of 5-fluorocytosine. For
this purpose, the T1 seeds of the primary transformants are laid
out on selection medium containing 200 µg/ml 5-fluorocytosine.
These selection plates are incubated under long-day conditions
(16 h of light, 21°C/8 h of darkness, 18°C). Seedlings which
develop normally in the presence of 5-fluorocytosine are separated
after 7 days and transferred to new selection plates. These
plates are incubated for another 14 days under unchanged conditions.
The resistant seedlings are then transplanted into soil and
cultured under short-day conditions (8 h of light, 21°C/16 h of
darkness, 18°C). After 14 days, the young plants are transferred to
the greenhouse and cultured under short-day conditions.






Example 5: Preparation of a plant transformation vector
containing an expression cassette for expressing a
double-stranded codA RNA and a plant selection marker

A plant selection marker consisting of a mutated variant of the
A. thaliana Als gene, coding for the acetolactate synthase under
the control of the promoter of the A. thaliana actin-2 gene
(Meagher RB & Williamson RE (1994) The plant cytoskeleton.
In The Plant Cytoskeleton (Meyerowitz, E. & Somerville, C., eds),
pp. 1049-1084. Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, New York), and the octopine synthase terminator (Gielen J
et al. (1984) EMBO J 3:835-846) is inserted into pSUN1-codA-RNAi
(see Fig. 7; SEQ ID NO: 57) (At.Act.-2-At.Als-R-ocsT).



For this purpose, the pSUN1-codA-RNAi vector is first linearized
using the restriction enzyme PvuII. Subsequently, a linear DNA
fragment with blunt ends, coding for a mutated variant of the
acetolactate synthase (Als-R gene), is ligated into said
linearized vector under standard conditions. Prior to ligation, this
DNA fragment has been digested with the restriction enzyme KpnI
and the protruding ends have been converted into blunt ends by
treatment with Pwo DNA polymerase (Roche) according to the
manufacturer's instructions. This mutated variant of the
A. thaliana Als gene cannot be inhibited by herbicides of the
imidazolinone type. By expressing this mutated At.Als-R gene, the plants
obtain the ability to grow in the presence of the herbicide
Pursuit. Correspondingly positive clones
(pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT; SEQ ID NO: 58) are identified by
analytical restriction digest and sequencing.

The vector obtained enables an artificial codA RNA variant
(consisting of two identical nucleic acid elements which are
separated by an intron and inverted to one another) and a mutated
variant of the A. thaliana Als gene to be expressed
constitutively. Transcription of this artificial codA RNA variant results in
the formation of a double-stranded RNA molecule, owing to the
complementarity of the inverted nucleic acid elements. The
presence of this molecule induces the suppression of codA gene
expression (accumulation of RNA) by means of double-strand RNA
interference. Expression of the Als-R gene imparts to the plants
the ability to grow in the presence of herbicides of the
imidazolinone type.
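
The phenotypes expected from this vector can be summarized as a small truth table. The sketch is idealized and hypothetical: it assumes that the dsRNA suppresses codA completely and that Als-R confers full herbicide tolerance, and the function `grows` and its parameters are illustrative names only.

```python
def grows(codA_active: bool, has_dsRNA: bool, has_AlsR: bool,
          five_fc: bool, imidazolinone: bool) -> bool:
    """Idealized growth prediction for one plant on one selection medium."""
    # Assumption: the codA-dsRNA fully suppresses the codA marker.
    codA_effective = codA_active and not has_dsRNA
    if five_fc and codA_effective:
        return False  # 5-fluorocytosine is converted into a toxic product
    if imidazolinone and not has_AlsR:
        return False  # wild-type acetolactate synthase is inhibited
    return True


plants = {
    "A. thaliana-[codA], untransformed": dict(codA_active=True, has_dsRNA=False, has_AlsR=False),
    "A. thaliana-[codA] + dsRNA + Als-R": dict(codA_active=True, has_dsRNA=True, has_AlsR=True),
}
media = {
    "5-fluorocytosine": dict(five_fc=True, imidazolinone=False),
    "imidazolinone herbicide": dict(five_fc=False, imidazolinone=True),
    "both agents": dict(five_fc=True, imidazolinone=True),
}

for plant, genotype in plants.items():
    for medium, agents in media.items():
        print(f"{plant} on {medium}: grows={grows(**genotype, **agents)}")
```

In this idealized picture only plants carrying the construct grow on any of the three media, which is the basis for using either agent, or both together, for selection.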



Example 6: Preparation of transgenic Arabidopsis thaliana plants

Transgenic Arabidopsis thaliana plants expressing the E. coli
codA gene as a marker protein ("A. thaliana-[codA]") were prepared
as described (Kirik et al. (2000) EMBO J 19(20):5562-6).

The A. thaliana-[codA] plants are transformed with an
Agrobacterium tumefaciens strain (GV3101 [pMP90]) on the basis of a
modified vacuum infiltration method (Clough S & Bent A (1998) Plant J
16(6):735-43; Bechtold N et al. (1993) CR Acad Sci Paris
1144(2):204-212). The Agrobacterium tumefaciens cells used have
previously been transformed with the DNA construct described
(pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT; SEQ ID NO: 58). In this
way, double transgenic A. thaliana-[codA] plants are generated
which additionally express an artificial codA double-stranded RNA
and a herbicide-insensitive variant of the Als gene (Als-R) under
the control of the constitutive nitrilase1 promoter
(A. thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT]). Expression of the
codA gene is suppressed as a consequence of the dsRNAi effect
induced by the presence of this artificial codA-dsRNA. These double
transgenic plants may be identified owing to their regained
ability to grow in the presence of 5-fluorocytosine in the culture
medium. In addition, positively transformed plants can be
selected owing to their ability to grow in the presence of the
herbicide Pursuit in the culture medium.






For the purpose of selection, the T1 seeds of primary transformants
are therefore laid out on selection medium containing 100 µg/ml
5-fluorocytosine. These selection plates are incubated under
long-day conditions (16 h of light, 21°C/8 h of darkness, 18°C).
Seedlings which develop normally in the presence of
5-fluorocytosine are separated after 28 days and transferred to new
selection plates. These plates are incubated for another 14 days
under unchanged conditions. The resistant seedlings are then
transplanted into soil and cultured under short-day conditions
(8 h of light, 21°C/16 h of darkness, 18°C). After a further 14
days, the young plants are transferred to the greenhouse and
cultured under short-day conditions.



In addition, seeds of the primary transformants may be selected on
the basis of their ability to grow in the presence of the herbicide
Pursuit. It is furthermore possible to carry out dual selection
using the herbicide Pursuit and 5-fluorocytosine. For this purpose,
the T1 seeds of primary transformants are laid out on selection
medium containing the herbicide Pursuit at a concentration of
100 nM (in the case of dual selection, 100 µg/ml 5-fluorocytosine
is likewise present). These selection plates are incubated under
long-day conditions (16 h of light, 21°C/8 h of darkness, 18°C).






Seedlings which develop normally in the presence of Pursuit (or of
Pursuit and 5-fluorocytosine) are separated after 28 days and
transferred to new selection plates. These plates are incubated
under unchanged conditions for another 14 days. The resistant
seedlings are then transplanted into soil and cultured under
short-day conditions (8 h of light, 21°C/16 h of darkness, 18°C).
After 14 days, the young plants are transferred to the greenhouse
and cultured under short-day conditions.



Example 7: Analysis of the double transgenic A. thaliana plants
selected using 5-fluorocytosine and/or Pursuit
(A. thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT])

Integration of the T-DNA region of the vector used for
transformation, pSUN1-codA-RNAi-At.Als-R, into the genomic DNA of
the starting plant (A. thaliana-[codA]) and the loss of
codA-specific mRNA in these transgenic plants
(A. thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT]) can be
detected by applying Southern analyses and PCR techniques or
Northern analyses.

In order to carry out said analyses, total RNA and genomic DNA are
isolated from leaf tissue of the transgenic plants and suitable
controls (using the RNeasy Maxi Kit (RNA) and the DNeasy Plant Maxi
Kit (genomic DNA), respectively, according to the instructions of
the manufacturer, Qiagen).

In the PCR analyses, the genomic DNA may be used directly as
template for the PCR. Total RNA is transcribed to cDNA prior to the
PCR. The cDNA synthesis is carried out using the reverse
transcriptase SuperScript II (Invitrogen) according to the
manufacturer's instructions.

Example 8: Detection of the reduction in the steady-state amount
of codA RNA in the positively selected double transgenic plants
(A. thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT]) in
comparison with the starting plants (A. thaliana-[codA]) used for
transformation, by means of cDNA synthesis with subsequent PCR
amplification.

PCR amplification of the codA-specific cDNA:

The cDNA of the codA gene (ACCESSION S56903) may be amplified using
a sense-specific primer (codA5'C-term SEQ ID NO: 69) and an
antisense-specific primer (codA3'C-term SEQ ID NO: 70). The PCR
conditions to be chosen are as follows:

The PCR was carried out in a 50 µl reaction mixture which
contained:

- 2 µl (200 ng) of cDNA from A. thaliana-[codA] or
  A. thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT] plants
- 0.2 mM dATP, dTTP, dGTP, dCTP
- 1.5 mM Mg(OAc)2
- 5 µg of bovine serum albumin
- 40 pmol of codA5'C-term SEQ ID NO: 69
- 40 pmol of codA3'C-term SEQ ID NO: 70
- 15 µl of 3.3x rTth DNA Polymerase XL buffer (PE Applied
  Biosystems)
- 5 U of rTth DNA Polymerase XL (PE Applied Biosystems)

The PCR was carried out under the following cycle conditions:

Step 1: 5 minutes 94°C (denaturation)
Step 2: 3 seconds 94°C
Step 3: 1 minute 56°C (annealing)
Step 4: 2 minutes 72°C (elongation)
30 repeats of steps 2 to 4
Step 5: 10 minutes 72°C (post elongation)
Step 6: 4°C (waiting loop)
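The reaction mixture and cycle conditions above can be checked with a
little arithmetic. The following Python sketch is an illustration only
and is not part of the described procedure: the function names are
hypothetical, step durations are taken as printed, and ramp times
between temperatures and the final 4°C hold are ignored.

```python
# Worked arithmetic for the 50 µl PCR mixture and the cycling programme.
# All helper names are illustrative; the input figures come from the text.

REACTION_VOLUME_UL = 50.0


def final_conc_um(amount_pmol, volume_ul):
    """pmol per µl is numerically equal to µmol per litre (µM)."""
    return amount_pmol / volume_ul


def buffer_fold(stock_fold, stock_volume_ul, volume_ul):
    """Final buffer strength after diluting a concentrated stock."""
    return stock_fold * stock_volume_ul / volume_ul


def programme_seconds(initial_s, cycle_steps_s, cycles, final_s):
    """Programmed time only: no ramping, no final hold."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


primer_um = final_conc_um(40, REACTION_VOLUME_UL)       # each primer
buffer_x = buffer_fold(3.3, 15, REACTION_VOLUME_UL)     # XL buffer
total_s = programme_seconds(5 * 60, [3, 60, 120], 30, 10 * 60)

print(f"primer: {primer_um:.2f} µM each")
print(f"buffer: {buffer_x:.2f}x")
print(f"programme: {total_s} s ({total_s / 60:.1f} min)")
```

Under these assumptions each primer is present at 0.8 µM, the buffer
is at roughly 1x, and the programmed time is 6390 s (about 106.5 min).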

In the positively selected plants, the steady-state amount of the
mRNA of the codA gene and the amount of CODA protein resulting
therefrom is reduced so much that a quantitative conversion of
5-fluorocytosine to 5-fluorouracil can no longer occur.
Consequently, these plants (in contrast to the untransformed
plants) can grow in the presence of 5-fluorocytosine. Thus it is
demonstrated that transgenic plants can be identified owing to the
applied principle of preventing expression of a negative selection
marker.

Example 9: Detection of the DNA coding for codA-RNAi by using
genomic DNA of the positively selected double transgenic plants
(A. thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT])

The codA-RNAi transgene may be amplified using a codA-specific
primer (e.g. codA5'HindIII SEQ ID NO: 50) and a 35S
terminator-specific primer (35sT 5' Primer SEQ ID NO: 71). Using
this primer combination, it is possible to detect specifically only
the DNA coding for the codA RNAi construct, since the codA gene
which was already present in the starting plants
(A. thaliana-[codA]) used for transformation is flanked by the nos
terminator.
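The specificity argument above (a product forms only where both
primers find a binding site on the same template) can be illustrated
with a toy in-silico PCR. This is a simplified sketch, not the
procedure of the example: the two terminator strings are arbitrary
placeholders rather than the real nos or 35S terminator sequences, the
inverted-repeat structure of the RNAi construct is ignored, and exact
string matching stands in for primer annealing.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def amplicon(template, forward, reverse):
    """Return the product if both primers match exactly, else None."""
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# First 24 nt of the codA coding sequence as read from SEQ ID NO: 3.
CODA_5PRIME = "ATGTCGAATAACGCTTTACAAACA"
# Hypothetical stand-ins for the two terminators (NOT real sequences).
TOY_NOS_T = "GGATTCAGGATTCAGGATTC"
TOY_35S_T = "CCTAAGTCCTAAGTCAGTCA"

RESIDENT_LOCUS = CODA_5PRIME + TOY_NOS_T   # codA already in the plant
RNAI_LOCUS = CODA_5PRIME + TOY_35S_T       # newly introduced construct

FORWARD = CODA_5PRIME[:12]                      # codA-specific primer
REVERSE = reverse_complement(TOY_35S_T[-12:])   # terminator primer

print(amplicon(RNAI_LOCUS, FORWARD, REVERSE))
print(amplicon(RESIDENT_LOCUS, FORWARD, REVERSE))
```

In this toy setting the forward primer binds both templates, but only
the template carrying the second terminator also offers a site for the
reverse primer, so only that template yields a product.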






The PCR conditions to be chosen are as follows:

The PCR was carried out in a 50 µl reaction mixture which
contained:

- 2 µl (200 ng) of genomic DNA from the
  A. thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT] plants
- 0.2 mM dATP, dTTP, dGTP, dCTP
- 1.5 mM Mg(OAc)2
- 5 µg of bovine serum albumin
- 40 pmol of codA-specific sense primer (SEQ ID NO: 50, 53 or 69)
- 40 pmol of 35sT 5' primer SEQ ID NO: 71
- 15 µl of 3.3x rTth DNA Polymerase XL buffer (PE Applied
  Biosystems)
- 5 U of rTth DNA Polymerase XL (PE Applied Biosystems)

The PCR was carried out under the following cycle conditions:

Step 1: 5 minutes 94°C (denaturation)
Step 2: 3 seconds 94°C
Step 3: 1 minute 56°C (annealing)
Step 4: 2 minutes 72°C (elongation)
30 repeats of steps 2 to 4
Step 5: 10 minutes 72°C (post elongation)
Step 6: 4°C (waiting loop)

In this way, it is possible to detect in the positively selected
plants integration of the codA-RNAi DNA construct into the
chromosomal DNA of the starting plants used for transformation. Thus
it is demonstrated that transgenic plants can be identified owing
to the applied principle of preventing expression of a negative
selection marker.

Example 10: Detection of the reduction in the steady-state amount
of codA RNA in the positively selected double transgenic plants
(A. thaliana-[codA]-[codA-RNAi-At.Act.-2-At.Als-R-ocsT]) in
comparison with the starting plants (A. thaliana-[codA]) used for
transformation, by Northern analysis.



Gel-electrophoretic RNA fractionation: 


For each RNA agarose gel, 3 g of agarose are dissolved in 150 ml of
H2O (f.c. 1.5% (w/v)) in a microwave oven and cooled to 60°C. The
addition of 20 ml of 10x MEN (0.2 M MOPS, 50 mM sodium acetate,
10 mM EDTA) and 30 ml of formaldehyde (f.c. 2.2 M) causes further
cooling, so that the well-mixed solution must be poured quickly.
Formaldehyde prevents the formation of secondary structures in the
RNA, and therefore the rate of migration is approximately
proportional to the molecular weight (LEHRBACH H et al. (1977)
Biochem J 16:4743-4751). Prior to application to the gel, the RNA
samples are denatured in the following mixture: 20 µl of RNA
(1-2 µg/µl), 5 µl of 10x MEN buffer, 6 µl of formaldehyde, 20 µl
of formamide.
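The stated final concentrations of the gel and of the denaturing
mixture follow from the volumes given above. The short Python sketch
below reproduces that arithmetic; it is an illustration with
hypothetical helper names, and it assumes that the volumes are simply
additive. The formaldehyde concentration is not recomputed because the
molarity of the stock solution is not stated in the text.

```python
def percent_w_v(grams, total_ml):
    """Weight per volume in percent (g per 100 ml)."""
    return 100.0 * grams / total_ml


def fold_after_dilution(stock_fold, stock_volume, total_volume):
    """Strength of a concentrated buffer after dilution."""
    return stock_fold * stock_volume / total_volume


# Gel: 150 ml H2O + 20 ml 10x MEN + 30 ml formaldehyde
gel_volume_ml = 150 + 20 + 30
agarose_percent = percent_w_v(3, gel_volume_ml)
gel_men_fold = fold_after_dilution(10, 20, gel_volume_ml)

# Sample: 20 µl RNA + 5 µl 10x MEN + 6 µl formaldehyde + 20 µl formamide
sample_volume_ul = 20 + 5 + 6 + 20
sample_men_fold = fold_after_dilution(10, 5, sample_volume_ul)
rna_ug_range = (20 * 1, 20 * 2)   # 20 µl at 1-2 µg/µl

print(f"gel: {gel_volume_ml} ml, {agarose_percent:.1f}% (w/v), "
      f"{gel_men_fold:.1f}x MEN")
print(f"sample: {sample_volume_ul} µl, {sample_men_fold:.2f}x MEN, "
      f"{rna_ug_range[0]}-{rna_ug_range[1]} µg RNA")
```

The 200 ml total is what makes 3 g correspond to 1.5% (w/v) and 20 ml
of 10x MEN correspond to 1x, which agrees with the running buffer used
for electrophoresis.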



The mixture is mixed and incubated at 65°C for 10 minutes. 1/10
volume of sample buffer and 1 µl of ethidium bromide (10 mg/ml)
are added and the sample is then applied. Gel electrophoresis is
carried out in horizontal gels in 1x MEN at 120 V for two to
three hours. After electrophoresis, the gel is photographed under
UV light alongside a ruler for subsequent determination of the
fragment length. The RNA is then blotted onto a nylon membrane as
described in: SAMBROOK J et al. Molecular cloning: A laboratory
manual. Cold Spring Harbor, New York, Cold Spring Harbor Laboratory
Press, 1989.



Radioactive labeling of DNA fragments and Northern hybridization 

The codA cDNA fragment (codARNAi-sense SEQ ID NO: 49) can be
labeled using, for example, the High Prime kit sold by Roche
Diagnostics. The High Prime kit is based on the "random primed"
method for DNA labeling originally described by Feinberg and
Vogelstein. Labeling is carried out by denaturing approx. 25 ng
of DNA in 9-11 µl of H2O at 95°C for 10 min. After a short
incubation on ice, 4 µl of High Prime solution (contains a random
primer mixture, 4 units of Klenow polymerase and 0.125 mM each of
dATP, dTTP and dGTP in a reaction buffer containing 50% glycerol)
and 3-5 µl of [α-32P]dCTP (30-50 µCi) are added. The reaction
mixture is incubated at 37°C for at least 10 min and the
unincorporated dCTP is then separated from the now radiolabeled DNA
by means of gel filtration via a Sephadex G-50 column. The fragment
is subsequently denatured at 95°C for 10 min and kept on ice until
used. The following hybridization and preincubation buffers are
used:

Hypo Hybond

250 mM sodium phosphate buffer pH 7.2
1 mM EDTA
7% SDS (w/v)
250 mM NaCl
10 µg/ml ssDNA
5% polyethylene glycol (PEG) 6000
40% formamide



The hybridization temperature when using Hypo Hybond is 42°C and
the duration of hybridization is 16-24 h. The RNA filters are
washed using three different solutions: 2 x SSC (300 mM NaCl;
30 mM sodium citrate) + 0.1% SDS, 1 x SSC + 0.1% SDS and 0.1 x
SSC + 0.1% SDS. The duration and intensity of washing depend on
the strength of the bound radioactivity. After washing, the
filters are sealed in plastic foil and an X-ray film (X-OMat,
Kodak) is exposed overnight at -70°C. The signal strength on the
X-ray films is a measure of the amount of codA mRNA molecules in
the total RNA bound on the membranes. Thus it is possible to detect
the reduction in codA mRNA in the positively selected plants
compared to the starting plants used for transformation.
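The three wash solutions form a series of decreasing salt
concentration, which corresponds to increasing wash stringency. As a
small illustration, the following Python sketch derives the salt
concentrations of each wash from the 2 x SSC composition given above.
The names are hypothetical and the 1x values are simply half of the
stated 2x values.

```python
# 1x SSC composition in mM, derived from 2x = 300 mM NaCl / 30 mM citrate.
SSC_1X_MM = {"NaCl": 300.0 / 2, "sodium citrate": 30.0 / 2}

WASH_SERIES_FOLD = [2.0, 1.0, 0.1]   # each wash also contains 0.1% SDS


def ssc_composition(fold):
    """Salt concentrations (mM) of an SSC solution of the given strength."""
    return {salt: conc * fold for salt, conc in SSC_1X_MM.items()}


for fold in WASH_SERIES_FOLD:
    salts = ssc_composition(fold)
    print(f"{fold:g} x SSC: {salts['NaCl']:g} mM NaCl, "
          f"{salts['sodium citrate']:g} mM sodium citrate")
```

The last wash therefore contains one twentieth of the salt of the
first wash.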

In the positively selected plants, the steady-state amount of the
mRNA of the codA gene and the amount of CODA protein resulting
therefrom is reduced so much that a quantitative conversion of
5-fluorocytosine to 5-fluorouracil can no longer occur.
Consequently, these plants (in contrast to the untransformed
plants) can grow in the presence of 5-fluorocytosine. Thus it is
demonstrated that transgenic plants can be identified owing to
the applied principle of preventing expression of a negative
selection marker.

Example 11: Summary of the results of "negative-negative"
selection

Transformation of the codA-transgenic Arabidopsis plants with the
codA-dsRNA construct (pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT;
SEQ ID NO: 57) results in a significantly increased number of
double transgenic plants into whose genome the RNAi construct has
been successfully integrated, in the case of both single selection
(with 5-fluorocytosine alone) and dual selection (Pursuit and
5-fluorocytosine) (in each case in comparison with untransformed
plants). The analysis by means of PCR (see above) confirms the
double transgenic state for the majority of the plants generated in
this way. This successfully demonstrates the practicability of the
present invention, i.e. the usability of repression of a negative
marker for positive selection (in effect a "negative-negative"
selection).
SEQUENCE LISTING

<110> BASF Plant Science GmbH

<120> Novel selection processes

<130> PF53790-AT

<140>
<141>

<160> 71

<170> PatentIn Ver. 2.1

<210> 1
<211> 1284
<212> DNA

<213> Escherichia coli

<220>

<221> CDS

<222> (1)..(1281)

<223> coding for cytosine deaminase (codA)

<400> 1 

gtg tcg aat aac gct tta caa aca att att aac gcc cgg tta cca ggc 48
Val Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly
1 5 10 15

gaa gag ggg ctg tgg cag att cat ctg cag gac gga aaa atc agc gcc 96
Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala
20 25 30

att gat gcg caa tec ggc gtg atg ccc ata act gaa aac age ctg gat 144 
lie Asp Ala Gin Ser Gly Val Met Pro lie Thr Glu Asn Ser Leu Asp 
35 40 45 

gcc gaa caa ggt tta gtt ata ccg ccg ttt gtg gag cca cat att cac 192
Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His
50 55 60 

ctg gac acc acg caa ace gcc gga caa ccg aac tgg aat cag tec ggc 240 
Leu Asp Thr Thr Gin Thr Ala Gly Gin Pro Asn Trp Asn Gin Ser Gly 
65 70 75 80 

acg ctg ttt gaa ggc att gaa cgc tgg gcc gag cgc aaa gcg tta tta 288 
Thr Leu Phe Glu Gly lie Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu 
85 90 95 

acc cat gac gat gtg aaa caa cgc gca tgg caa acg ctg aaa tgg cag 336 
Thr His Asp Asp Val Lys Gin Arg Ala Trp Gin Thr Leu Lys Trp Gin 
100 105 110 

att gcc aac ggc att cag cat gtg cgt acc cat gtc gat gtt teg gat 384 
lie Ala Asn Gly lie Gin His Val Arg Thr His Val Asp Val Ser Asp 
115 120 125 

gca acg eta act gcg ctg aaa gca atg ctg gaa gtg aag cag gaa gtc 432 
Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gin Glu Val 
130 135 140 

gcg ccg tgg att gat ctg caa ate gtc gcc ttc cct cag gaa ggg att 480 
Ala Pro Trp lie Asp Leu Gin lie Val Ala Phe Pro Gin Glu Gly lie 
145 150 155 160 
ttg tcg tat ccc aac ggt gaa gcg ttg ctg gaa gag gcg tta cgc tta 528
Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu 

165 170 175 

ggg gca gat gta gtg ggg gcg att ccg cat ttt gaa ttt acc cgt gaa 576 
Gly Ala Asp Val Val Gly Ala lie Pro His Phe Glu Phe Thr Arg Glu 

180 185 190 

tac ggc gtg gag teg ctg cat aaa acc ttc gcc ctg gcg caa aaa tac 624 
Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gin Lys Tyr 
195 200 205 

gac cgt etc ate gac gtt cac tgt gat gag ate gat gac gag cag teg 672 
Asp Arg Leu He Asp Val His Cys Asp Glu He Asp Asp Glu Gin Ser 
210 215 220 

cgc ttt gtc gaa acc gtt get gcc ctg gcg cac cat gaa ggc atg ggc 720 
Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly 
225 230 235 240 

gcg cga gtc acc gcc age eac acc acg gca atg cac tec tat aac ggg 768 
Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly 
245 250 255 

gcg tat acc tea cgc ctg ttc cgc ttg ctg aaa atg tec ggt att aac 816 
Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly He Asn 
260 265 270 

ttt gtc gcc aac ccg ctg gtc aat att cat ctg caa gga cgt ttc gat 864 
Phe Val Ala Asn Pro Leu Val Asn He His Leu Gin Gly Arg Phe Asp 
275 280 285 

acg tat cca aaa cgt cgc ggc ate acg cgc gtt aaa gag atg ctg gag 912 
Thr Tyr Pro Lys Arg Arg Gly He Thr Arg Val Lys Glu Met Leu Glu 
290 295 300 

tec ggc att aac gtc tgc ttt ggt cac gat gat gtc ttc gat ccg tgg 960 
Ser Gly He Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp 
305 310 315 320 

tat ccg ctg gga acg gcg aat atg ctg caa gtg ctg cat atg ggg ctg 1008 
Tyr Pro Leu Gly Thr Ala Asn Met Leu Gin Val Leu His Met Gly Leu 
325 330 335 

cat gtt tgc cag ttg atg ggc tac ggg cag att aac gat ggc ctg aat 1056 
His Val Cys Gin Leu Met Gly Tyr Gly Gin He Asn Asp Gly Leu Asn 
340 345 350 

tta ate acc cac cac age gca agg acg ttg aat ttg cag gat tac ggc 1104 
Leu He Thr His His Ser Ala Arg Thr Leu Asn Leu Gin Asp Tyr Gly 
355 360 365 

att gee gcc gga aac age gcc aac ctg att ate ctg ccg get gaa aat 1152 
He Ala Ala Gly Asn Ser Ala Asn Leu He He Leu Pro Ala Glu Asn 
370 375 380 

ggg ttt gat gcg ctg cgc cgt cag gtt ccg gta cgt tat teg gta cgt 1200 
Gly Phe Asp Ala Leu Arg Arg Gin Val Pro Val Arg Tyr Ser Val Arg 
385 390 395 400 

ggc ggc aag gtg att gcc age aca caa ccg gca caa acc acc gta tat 1248 
Gly Gly Lys Val He Ala Ser Thr Gin Pro Ala Gin Thr Thr Val Tyr 
405 410 415 

ctg gag cag cca gaa gcc ate gat tac aaa cgt tga 1284 
Leu Glu Gin Pro Glu Ala He Asp Tyr Lys Arg 
420 425 
<210> 2
<211> 427 
<212> PRT 

<213> Escherichia coli 
<400> 2 

Val Ser Asn Asn Ala Leu Gin Thr lie lie Asn Ala Arg Leu Pro Gly 
15 10 15 

Glu Glu Gly Leu Trp Gin lie His Leu Gin Asp Gly Lys lie Ser Ala 
20 25 30 

He Asp Ala Gin Ser Gly Val Met Pro He Thr Glu Asn Ser Leu Asp 
35 40 45 

Ala Glu Gin Gly Leu Val He Pro Pro Phe Val Glu Pro His He His 
50 55 60 

Leu Asp Thr Thr Gin Thr Ala Gly Gin Pro Asn Trp Asn Gin Ser Gly 
65 70 75 80 

Thr Leu Phe Glu Gly He Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu 
85 90 95 

Thr His Asp Asp Val Lys Gin Arg Ala Trp Gin Thr Leu Lys Trp Gin 
100 105 110 

He Ala Asn Gly He Gin His Val Arg Thr His Val Asp Val Ser Asp 
115 120 125 

Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val
130 135 140 

Ala Pro Trp He Asp Leu Gin He Val Ala Phe Pro Gin Glu Gly He 
145 150 155 160 

Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu 
165 170 175 

Gly Ala Asp Val Val Gly Ala He Pro His Phe Glu Phe Thr Arg Glu 
180 185 190 

Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gin Lys Tyr 
195 200 205

Asp Arg Leu He Asp Val His Cys Asp Glu He Asp Asp Glu Gin Ser 
210 215 220 

Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly 
225 230 235 240 

Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly 
245 250 255 

Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly He Asn 
260 265 270 

Phe Val Ala Asn Pro Leu Val Asn He His Leu Gin Gly Arg Phe Asp 
275 280 285 

Thr Tyr Pro Lys Arg Arg Gly He Thr Arg Val Lys Glu Met Leu Glu 
290 295 300 

Ser Gly He Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp 
305 310 315 320 

Tyr Pro Leu Gly Thr Ala Asn Met Leu Gin Val Leu His Met Gly Leu 
325 330 335 
His Val Cys Gln Leu Met Gly Tyr Gly Gln Ile Asn Asp Gly Leu Asn
340 345 350 

Leu He Thr His His Ser Ala Arg Thr Leu Asn Leu Gin Asp Tyr Gly 
355 360 365 

He Ala Ala Gly Asn Ser Ala Asn Leu He He Leu Pro Ala Glu Asn 
370 375 380 

Gly Phe Asp Ala Leu Arg Arg Gin Val Pro Val Arg Tyr Ser Val Arg 
385 390 395 400

Gly Gly Lys Val He Ala Ser Thr Gin Pro Ala Gin Thr Thr Val Tyr 
405 410 415 

Leu Glu Gin Pro Glu Ala He Asp Tyr Lys Arg 
420 425 



<210> 3 
<211> 1284 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: coding for 
cytosine deaminase (codA) 

<220> 

<221> misc_feature
<222> (1)..(3)

<223> mutation of GTG to ATG start codon for expression
in eukaryotic hosts

<220> 

<221> CDS 

<222> (1)..(1281) 

<223> coding for cytosine deaminase (codA) 
<400> 3 

atg teg aat aac get tta caa aca att att aac gcc egg tta cca ggc 48 
Met Ser Asn Asn Ala Leu Gin Thr He He Asn Ala Arg Leu Pro Gly 
15 10 15 

gaa gag ggg ctg tgg cag att cat ctg cag gac gga aaa ate age gcc 96 
Glu Glu Gly Leu Trp Gin He His Leu Gin Asp Gly Lys He Ser Ala 
20 25 30 

att gat gcg caa tec ggc gtg atg ccc ata act gaa aac age ctg gat 144 
He Asp Ala Gin Ser Gly Val Met Pro He Thr Glu Asn Ser Leu Asp 
35 40 45 

gcc gaa caa ggt tta gtt ata ccg ccg ttt gtg gag cca cat att cac 192 
Ala Glu Gin Gly Leu Val He Pro Pro Phe Val Glu Pro His He His 
50 55 60 

ctg gac acc acg caa acc gcc gga caa ccg aac tgg aat cag tec ggc 240 
Leu Asp Thr Thr Gin Thr Ala Gly Gin Pro Asn Trp Asn Gin Ser Gly 
65 70 75 80 



acg ctg ttt gaa ggc att gaa cgc tgg gcc gag cgc aaa gcg tta tta 288 
Thr Leu Phe Glu Gly He Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu 
85 90 95 
acc cat gac gat gtg aaa caa cgc gca tgg caa acg ctg aaa tgg cag 336
Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln
100 105 110 

att gcc aac ggc att cag cat gtg cgt acc cat gtc gat gtt teg gat 384 
lie Ala Asn Gly lie Gin His Val Arg Thr His Val Asp Val Ser Asp 
115 120 125 

gca acg cta act gcg ctg aaa gca atg ctg gaa gtg aag cag gaa gtc 432
Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val
130 135 140 

gcg ccg tgg att gat ctg caa ate gtc gcc ttc cct cag gaa ggg att 480 
Ala Pro Trp He Asp Leu Gin He Val Ala Phe Pro Gin Glu Gly He 
145 150 155 160 

ttg teg tat ecc aac ggt gaa gcg ttg ctg gaa gag gcg tta cgc tta 528 
Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu 
165 170 175 

ggg gca gat gta gtg ggg gcg att ccg cat ttt gaa ttt acc cgt gaa 576 
Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu
180 185 190 

tac ggc gtg gag teg ctg cat aaa acc ttc gcc ctg gcg caa aaa tac 624 
Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gin Lys Tyr 
195 200 205 

gac cgt etc ate gac gtt cac tgt gat gag ate gat gac gag cag teg 672 
Asp Arg Leu He Asp Val His Cys Asp Glu He Asp Asp Glu Gin Ser 
210 215 220 

cgc ttt gtc gaa acc gtt get gee ctg gcg cac cat gaa ggc atg ggc 720 
Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly 
225 230 235 240 

gcg cga gtc acc gcc age cac acc acg gca atg cac tec tat aac ggg 768 
Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly 
245 250 255 

gcg tat acc tea cgc ctg ttc cgc ttg ctg aaa atg tec ggt att aac 816 
Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly He Asn 
260 265 270 

ttt gtc gcc aac ccg ctg gtc aat att cat ctg caa gga cgt ttc gat 864 
Phe Val Ala Asn Pro Leu Val Asn He His Leu Gin Gly Arg Phe Asp 
275 280 285 

acg tat cea aaa cgt cgc ggc ate acg cgc gtt aaa gag atg ctg gag 912 
Thr Tyr Pro Lys Arg Arg Gly He Thr Arg Val Lys Glu Met Leu Glu 
290 295 300 

tec ggc att aac gtc tge ttt ggt cac gat gat gtc ttc gat ccg tgg 960 
Ser Gly He Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp 
305 310 315 320 

tat ccg ctg gga acg gcg aat atg ctg caa gtg ctg cat atg ggg ctg 1008 
Tyr Pro Leu Gly Thr Ala Asn Met Leu Gin Val Leu His Met Gly Leu 
325 330 335 

eat gtt tge cag ttg atg ggc tac ggg cag att aac gat ggc ctg aat 1056 
His Val Cys Gin Leu Met Gly Tyr Gly Gin He Asn Asp Gly Leu Asn 
340 345 350 

tta ate ace cac cae age gea agg acg ttg aat ttg cag gat tac ggc 1104 
Leu He Thr His His Ser Ala Arg Thr Leu Asn Leu Gin Asp Tyr Gly 
355 360 365 
att gcc gcc gga aac agc gcc aac ctg att atc ctg ccg gct gaa aat 1152
Ile Ala Ala Gly Asn Ser Ala Asn Leu Ile Ile Leu Pro Ala Glu Asn
370 375 380 

ggg ttt gat gcg ctg cgc cgt cag gtt ccg gta cgt tat teg gta cgt 1200 
Gly Phe Asp Ala Leu Arg Arg Gin Val Pro Val Arg Tyr Ser Val Arg 
385 390 395 400 

ggc ggc aag gtg att gcc age aca caa ccg gca caa acc acc gta tat 1248 
Gly Gly Lys Val He Ala Ser Thr Gin Pro Ala Gin Thr Thr Val Tyr 
405 410 415 

ctg gag cag cca gaa gcc ate gat tac aaa cgt tga 1284 
Leu Glu Gin Pro Glu Ala He Asp Tyr Lys Arg 
420 425 

<210> 4 
<211> 427 
<212> PRT 

<213> Artificial sequence 

<223> Description of the artificial sequence: coding for 
cytosine deaminase (codA) 

<400> 4 

Met Ser Asn Asn Ala Leu Gin Thr He He Asn Ala Arg Leu Pro Gly 
15 10 15 

Glu Glu Gly Leu Trp Gin He His Leu Gin Asp Gly Lys He Ser Ala 
20 25 30 

He Asp Ala Gin Ser Gly Val Met Pro He Thr Glu Asn Ser Leu Asp 
35 40 45 

Ala Glu Gin Gly Leu Val He Pro Pro Phe Val Glu Pro His He His 
50 55 60 

Leu Asp Thr Thr Gin Thr Ala Gly Gin Pro Asn Trp Asn Gin Ser Gly 
65 70 75 80 

Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu

85 90 95 

Thr His Asp Asp Val Lys Gin Arg Ala Trp Gin Thr Leu Lys Trp Gin 
100 105 110 

He Ala Asn Gly He Gin His Val Arg Thr His Val Asp Val Ser Asp 
115 120 125 

Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gin Glu Val 
130 135 140 

Ala Pro Trp He Asp Leu Gin He Val Ala Phe Pro Gin Glu Gly He 
145 150 155 160 

Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu 
165 170 175 

Gly Ala Asp Val Val Gly Ala He Pro His Phe Glu Phe Thr Arg Glu 
180 185 190 

Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gin Lys Tyr 
195 200 205 

Asp Arg Leu He Asp Val His Cys Asp Glu He Asp Asp Glu Gin Ser 
210 215 220 



CA 02493364 2005-01-21 



PF 53790 



Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly
225 230 235 240 

Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly 
245 250 255 

Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly lie Asn 
260 265 270 

Phe Val Ala Asn Pro Leu Val Asn lie His Leu Gin Gly Arg Phe Asp 

275 280 285 

Thr Tyr Pro Lys Arg Arg Gly lie Thr Arg Val Lys Glu Met Leu Glu 

290 295 300 

Ser Gly lie Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp 
305 310 315 320 

Tyr Pro Leu Gly Thr Ala Asn Met Leu Gin Val Leu His Met Gly Leu 
325 330 335 

His Val Cys Gin Leu Met Gly Tyr Gly Gin He Asn Asp Gly Leu Asn 
340 345 350 

Leu He Thr His His Ser Ala Arg Thr Leu Asn Leu Gin Asp Tyr Gly 

355 360 365 

He Ala Ala Gly Asn Ser Ala Asn Leu He He Leu Pro Ala Glu Asn 

370 375 380 

Gly Phe Asp Ala Leu Arg Arg Gin Val Pro Val Arg Tyr Ser Val Arg 
385 390 395 400 

Gly Gly Lys Val He Ala Ser Thr Gin Pro Ala Gin Thr Thr Val Tyr 
405 410 415 

Leu Glu Gin Pro Glu Ala He Asp Tyr Lys Arg 
420 425 



<210> 5 
<211> 1221 
<212> DNA 

<213> Streptomyces griseolus 

<220> 

<221> CDS 

<222> (1)..(1218) 

<223> coding for cytochrome P450-Sul (suaC) 
<400> 5 

atg acc gat acc gcc acg acg ccc cag acc acg gac gca ccc gcc ttc 48 
Met Thr Asp Thr Ala Thr Thr Pro Gin Thr Thr Asp Ala Pro Ala Phe 
15 10 15 

ccg age aac egg age tgt ccc tac cag tta ccg gac ggc tac gcc cag 96 
Pro Ser Asn Arg Ser Cys Pro Tyr Gin Leu Pro Asp Gly Tyr Ala Gin 
20 25 30 

etc egg gac acc ccc ggc ccc ctg cac egg gtg acg etc tac gac ggc 144 
Leu Arg Asp Thr Pro Gly Pro Leu His Arg Val Thr Leu Tyr Asp Gly 

35 40 45 

cgt cag gcg tgg gtg gtg acc aag cac gag gcc gcg cgc aaa ctg ctc 192
Arg Gln Ala Trp Val Val Thr Lys His Glu Ala Ala Arg Lys Leu Leu
50 55 60 
ggc gac ccc cgg ctg tcc tcc aac cgg acg gac gac aac ttc ccc gcc 240
Gly Asp Pro Arg Leu Ser Ser Asn Arg Thr Asp Asp Asn Phe Pro Ala 
65 70 75 80 

acg tea ecg cgc ttc gag gcc gtc egg gag age ccg cag gcg ttc ate 288 
Thr Ser Pro Arg Phe Glu Ala Val Arg Glu Ser Pro Gin Ala Phe lie 
85 90 95 

ggc etg gac ccg ccc gag cac ggc acc egg egg egg atg acg ate age 336 
Gly Leu Asp Pro Pro Glu His Gly Thr Arg Arg Arg Met Thr lie Ser 
100 105 110 

gag ttc acc gtc aag cgg atc aag ggc atg cgc ccc gag gtc gag gag 384
Glu Phe Thr Val Lys Arg Ile Lys Gly Met Arg Pro Glu Val Glu Glu
115 120 125 

gtg gtg cac ggc ttc etc gac gag atg ctg gcc gee ggc ccg acc gcc 432 
Val Val His Gly Phe Leu Asp Glu Met Leu Ala Ala Gly Pro Thr Ala 

130 135 140 

gac ctg gtc agt cag ttc gcg ctg ccg gtg ccc tec atg gtg ate tgc 480 
Asp Leu Val Ser Gin Phe Ala Leu Pro Val Pro Ser Met Val lie Cys 
145 150 155 160 

cga etc etc ggc gtg ccc tac gcc gac cac gag ttc ttc cag gac gcg 528 
Arg Leu Leu Gly Val Pro Tyr Ala Asp His Glu Phe Phe Gin Asp Ala 
165 170 175 

age aag egg ctg gtg cag tec acg gac gcg cag age gcg etc acc gcg 576 
Ser Lys Arg Leu Val Gin Ser Thr Asp Ala Gin Ser Ala Leu Thr Ala 
180 185 190 

cgg aac gac ctc gcg ggt tac ctg gac ggc ctc atc acc cag ttc cag 624
Arg Asn Asp Leu Ala Gly Tyr Leu Asp Gly Leu Ile Thr Gln Phe Gln

195 200 205 

ace gaa ccg ggc gcg ggc ctg gtg ggc get ctg gtc gcc gac cag ctg 672 
Thr Glu Pro Gly Ala Gly Leu Val Gly Ala Leu Val Ala Asp Gin Leu 

210 215 220 

gcc aac ggc gag ate gac cgt gag gaa ctg ate tec acc gcg atg ctg 720 
Ala Asn Gly Glu He Asp Arg Glu Glu Leu He Ser Thr Ala Met Leu 
225 230 235 240 

etc etc ate gcc ggc cac gag ace acg gcc teg atg ace tec etc age 768 
Leu Leu He Ala Gly His Glu Thr Thr Ala Ser Met Thr Ser Leu Ser 

245 250 255 

gtg ate ace ctg ctg gac cac ccc gag cag tac gcc gcc ctg cgc gcc 816 
Val He Thr Leu Leu Asp His Pro Glu Gin Tyr Ala Ala Leu Arg Ala 

260 265 270 

gac cgc age etc gtg ccc ggc gcg gtg gag gaa ctg etc cgc tac etc 864 
Asp Arg Ser Leu Val Pro Gly Ala Val Glu Glu Leu Leu Arg Tyr Leu 

275 280 285

gcc ate gcc gac ate gcg ggc ggc cgc gtc gcc acg gcg gac ate gag 912 
Ala He Ala Asp He Ala Gly Gly Arg Val Ala Thr Ala Asp He Glu 

290 295 300 

gtc gag ggg cac etc ate egg gcc ggc gag ggc gtg ate gtc gtc aac 960 
Val Glu Gly His Leu He Arg Ala Gly Glu Gly Val He Val Val Asn 
305 310 315 320 

teg ata gee aac egg gac ggc acg gtg tac gag gac ccg gac gee etc 1008 
Ser He Ala Asn Arg Asp Gly Thr Val Tyr Glu Asp Pro Asp Ala Leu 
325 330 335 
gac atc cac cgc tcc gcg cgc cac cac ctc gcc ttc ggc ttc ggc gtg 1056
Asp Ile His Arg Ser Ala Arg His His Leu Ala Phe Gly Phe Gly Val
340 345 350 

cac cag tgc ctg ggc cag aac etc gee egg ctg gag ctg gag gtc ate 1104 
His Gin Cys Leu Gly Gin Asn Leu Ala Arg Leu Glu Leu Glu Val He 
355 360 365 

etc aac gcc etc atg gac cgc gtc ccg aeg ctg ega ctg gee gtc ccc 1152 
Leu Asn Ala Leu Met Asp Arg Val Pro Thr Leu Arg Leu Ala Val Pro 
370 375 380 

gtc gag cag ttg gtg ctg cgg ccg ggt acg acg atc cag ggc gtc aac 1200
Val Glu Gln Leu Val Leu Arg Pro Gly Thr Thr Ile Gln Gly Val Asn
385 390 395 400 

gaa etc ccg gtc ace tgg tga 1221 
Glu Leu Pro Val Thr Trp 
405 

<210> 6 
<211> 406 
<212> PRT 

<213> Streptomyces griseolus 
<400> 6 

Met Thr Asp Thr Ala Thr Thr Pro Gin Thr Thr Asp Ala Pro Ala Phe 
15 10 15 

Pro Ser Asn Arg Ser Cys Pro Tyr Gin Leu Pro Asp Gly Tyr Ala Gin 
20 25 30 

Leu Arg Asp Thr Pro Gly Pro Leu His Arg Val Thr Leu Tyr Asp Gly 
35 40 45 

Arg Gin Ala Trp Val Val Thr Lys His Glu Ala Ala Arg Lys Leu Leu 
50 55 60 

Gly Asp Pro Arg Leu Ser Ser Asn Arg Thr Asp Asp Asn Phe Pro Ala 
65 70 75 80 

Thr Ser Pro Arg Phe Glu Ala Val Arg Glu Ser Pro Gin Ala Phe He 
85 90 95 

Gly Leu Asp Pro Pro Glu His Gly Thr Arg Arg Arg Met Thr He Ser 
100 105 110 

Glu Phe Thr Val Lys Arg He Lys Gly Met Arg Pro Glu Val Glu Glu 
115 120 125 

Val Val His Gly Phe Leu Asp Glu Met Leu Ala Ala Gly Pro Thr Ala 
130 135 140 

Asp Leu Val Ser Gin Phe Ala Leu Pro Val Pro Ser Met Val He Cys 
145 150 155 160 

Arg Leu Leu Gly Val Pro Tyr Ala Asp His Glu Phe Phe Gin Asp Ala 
165 170 175 

Ser Lys Arg Leu Val Gin Ser Thr Asp Ala Gin Ser Ala Leu Thr Ala 
180 185 190 

Arg Asn Asp Leu Ala Gly Tyr Leu Asp Gly Leu Ile Thr Gln Phe Gln
195 200 205 

Thr Glu Pro Gly Ala Gly Leu Val Gly Ala Leu Val Ala Asp Gin Leu 
210 215 220 
Ala Asn Gly Glu Ile Asp Arg Glu Glu Leu Ile Ser Thr Ala Met Leu
225 230 235 240 

Leu Leu He Ala Gly His Glu Thr Thr Ala Ser Met Thr Ser Leu Ser 
245 250 255 

Val He Thr Leu Leu Asp His Pro Glu Gin Tyr Ala Ala Leu Arg Ala 
260 265 270 

Asp Arg Ser Leu Val Pro Gly Ala Val Glu Glu Leu Le-u Arg Tyr Leu 

275 280 285 

Ala He Ala Asp He Ala Gly Gly Arg Val Ala Thr Ala Asp He Glu 

290 295 300 

Val Glu Gly His Leu He Arg Ala Gly Glu Gly Val He Val Val Asn 
305 310 315 320 

Ser He Ala Asn Arg Asp Gly Thr Val Tyr Glu Asp Pro Asp Ala Leu 
325 330 335 

Asp He His Arg Ser Ala Arg His His Leu Ala Phe Gly Phe Gly Val 

340 345 350 

His Gin Cys Leu Gly Gin Asn Leu Ala Arg Leu Glu Leu Glu Val He 

355 360 365 

Leu Asn Ala Leu Met Asp Arg Val Pro Thr Leu Arg Leu Ala Val Pro 

370 375 380 

Val Glu Gin Leu Val Leu Arg Pro Gly Thr Thr He Gin Gly Val Asn 
385 390 395 400 

Glu Leu Pro Val Thr Trp 
405 



<210> 7 
<211> 1404 
<212> DNA 

<213> Agrobacterium tumefaciens 

<220> 

<221> CDS 

<222> (1)..(1401)

<223> coding for indoleacetamide hydrolase (tms2)
<400> 7 

atg gtg ccc att acc tcg tta gca caa acc cta gaa cgc ctg aga cgg 48
Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

aaa gac tac tcc tgc tta gaa cta gta gaa act ctg ata gcg cgt tgc 96
Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

caa gct gca aaa cca tta aat gcc ctt ctg gct aca gac tgg gat ggc 144
Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

ttg cgg cga agc gcc aaa aaa att gat cgt cat gga aac gcc gga tta 192
Leu Arg Arg Ser Ala Lys Lys Ile Asp Arg His Gly Asn Ala Gly Leu
50 55 60

ggt ctt tgc ggc att cca ctc tgt ttt aag gcg aac atc gcg acc ggc 240
Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

ata ttt cct aca agc gct gct act ccg gcg ctg ata aac cac ttg cca 288
Ile Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

aag ata cca tcc cgc gtc gca gaa aga ctt ttt tca gct gga gca ctg 336
Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

ccg ggt gcc tcg gga aac atg cat gag tta tcg ttt gga att acg agc 384
Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac tat gcc acc ggt gcg gtg cgg aac ccg tgg aat cca agt ctg 432
Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

ata cca gga ggc tca agc ggt ggt gtg gct gct gcg gtg gca agc cga 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

ttg atg tta ggc ggc ata ggc acc gat acc ggt gca tct gtt cgc cta 528
Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

ccc gca gcc ctg tgt ggc gta gta gga ttt cga ccg acg ctt gct cga 576
Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Ala Arg
180 185 190

tat cca aga gat cgg ata ata ccg gtc agc ccc acc cgg gac acc gcc 624
Tyr Pro Arg Asp Arg Ile Ile Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

gga atc ata gcg cag tgc gta gcc gat gtt ata atc ctc gac cag gtg 672
Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

att tcc gga cgg tcg gcg aaa att tca ccc atg ccg ctg aag ggg ctt 720
Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

cgg atc ggc ctc ccc act acc tac ttt tac gat gac ctt gat gct gat 768
Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

gtg gcc ttc gca gct gaa acg acg att cgc ttg cta gcc aac aga ggc 816
Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

gta acc ttt gtt gaa gcc gac atc ccc cac cta gag gaa ctg aat agt 864
Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

ggg gca agt ttg cca att gcg ctt tac gaa ttt cca cac gct cta aaa 912
Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

aag tat ctc gac gat ttt gtg gga aca gtt tct ttt tct gac gtt atc 960
Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

aaa gga att cgt agc ccc gat gta gcg aac att gtc agt gcg caa att 1008
Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

gat ggg cat caa att tcc aac gat gaa tat gaa ctg gcg cgt caa tcc 1056
Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

ttc agg cca agg ctc cag gcc act tat cgg aat tac ttc aga ctc tat 1104
Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

cag tta gat gca atc ctt ttc cca act gca ccc tta gcg gcc aaa gcc 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

ata ggt cag gag tcg tca gtc atc cac aat ggc tca atg atg aac act 1200
Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Met Asn Thr
385 390 395 400

ttc aag atc tac gtg cga aat gtg gac cca agc agc aac gca ggc cta 1248
Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

cct ggg ttg agc ctt cct gcc tgc ctt aca cct gat cgc ttg cct gtt 1296
Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

gga atg gaa att gat gga tta gcg ggg tca gac cac cgt ctg tta gca 1344
Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

atc ggg gca gca tta gaa aaa gcc ata aat ttt cct tcc ttt ccc gat 1392
Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Pro Ser Phe Pro Asp
450 455 460

gct ttt aat tag 1404
Ala Phe Asn
465

<210> 8 
<211> 467
<212> PRT

<213> Agrobacterium tumefaciens
<400> 8

Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

Leu Arg Arg Ser Ala Lys Lys Ile Asp Arg His Gly Asn Ala Gly Leu
50 55 60

Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Ile Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Ala Arg
180 185 190

Tyr Pro Arg Asp Arg Ile Ile Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Met Asn Thr
385 390 395 400

Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Pro Ser Phe Pro Asp
450 455 460

Ala Phe Asn
465



<210> 9 

<211> 1404 
<212> DNA 

<213> Agrobacterium tumefaciens 

<220> 
<221> CDS 
<222> (1)..(1401)

<223> coding for indoleacetamide hydrolase (tms2)
<400> 9

atg gtg ccc att acc tcg tta gca caa acc cta gaa cgc ctg aga cgg 48
Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

aaa gac tac tcc tgc tta gaa cta gta gaa act ctg ata gcg cgt tgc 96
Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

caa gct gca aaa cca tta aat gcc ctt ctg gct aca gac tgg gat ggc 144
Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

tta caa cga agc gcc aaa aaa att gat cgt cat gga aac gcc gga tta 192
Leu Arg Arg Ser Ala Lys Lys Ile Asp Arg His Gly Asn Ala Gly Leu
50 55 60

ggt ctt tgc ggc att cca ctc tgt ttt aag gcg aac atc gcg acc ggc 240
Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

ata ttt cct aca agc gct gct act ccg gcg ctg ata aac cac ttg cca 288
Ile Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

aag ata cca tcc cgc gtc gca gaa age ctt ttt tca gct gga gca ctg 336
Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

ccg ggt gcc tcg gga aac atg cat gag tta tcg ttt gga att acg agc 384
Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac tat gcc acc ggt gcg gtg cgg aac ccg tgg aat cca agt ctg 432
Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

ata cca gga ggc tca agc ggt ggt gtg gct gct gcg gtg gca agc cga 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

ttg atg tta ggc ggc ata ggc acc gat acc ggt gca tct gtt cgc cta 528
Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

ccc gca gcc ctg tgt ggc gta gta gga ttt cga ccg acg ctt gct cga 576
Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Ala Arg
180 185 190

tat cca aga gat cgg ata ata ccg gtc agc ccc acc cgg gac acc gcc 624
Tyr Pro Arg Asp Arg Ile Ile Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

gga atc ata gcg cag tgc gta gcc gat gtt ata atc ctc gat cag gtg 672
Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

att tcc gga cgg tcg gcg aaa att tca ccc atg ccg ctg aag ggg ctt 720
Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

cgg atc ggc ctc ccc act acc tac ttt tac gat gac ctt gat gct gat 768
Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

gtg gcc ttc gca gct gaa acg acg att cgc ttg cta gcc aac aga ggc 816
Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

gta acc ttt gtt gaa gcc gac atc ccc cac cta gag gaa ctg aat agt 864
Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

ggg gca agt ttg cca att gcg ctt tac gaa ttt cca cac gct cta aaa 912
Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

aag tat ctc gac gat ttt gtg gga aca gtt tct ttt tct gac gtt atc 960
Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

aaa gga att cgt agc ccc gat gta gcg aac att gtc agt gcg caa att 1008
Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

gat ggg cat caa att tcc aac gat gaa tat gaa ctg gcg cgt caa tcc 1056
Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

ttc agg cca agg ctc cag gcc act tat cgg aat tac ttc aga ctc tat 1104
Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

cag tta gat gca atc ctt ttc cca act gca ccc tta gcg gcc aaa gcc 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

ata ggt cag gag tcg tca gtc atc cac aat ggc tca atg ata aac act 1200
Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Ile Asn Thr
385 390 395 400

ttc aag atc tac gtg cga aat gtg gac cca agc agc aac gca ggc cta 1248
Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

cct ggg ttg agc ctt cct gcc tgc ctt aca cct gat cgc ttg cct gtt 1296
Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

gga atg gaa att gac gga tta gcg ggg tca gac cac cgt ctg tta gca 1344
Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

atc ggg gca gca tta gaa aaa gcc ata aat ttt cct tcc ttt ccc gat 1392
Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Pro Ser Phe Pro Asp
450 455 460

gct ttt aat tag 1404
Ala Phe Asn
465

<210> 10 
<211> 467 
<212> PRT 

<213> Agrobacterium tumefaciens
<400> 10 

Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

Leu Arg Arg Ser Ala Lys Lys Ile Asp Arg His Gly Asn Ala Gly Leu
50 55 60

Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Ile Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Ala Arg
180 185 190

Tyr Pro Arg Asp Arg Ile Ile Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380



CA 02493364 2005-01-21 



PF 53790 




Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Ile Asn Thr
385 390 395 400

Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Pro Ser Phe Pro Asp
450 455 460

Ala Phe Asn
465



<210> 11 
<211> 609 
<212> DNA 

<213> Xanthobacter autotrophicus 

<220> 

<221> CDS 

<222> (1)..(603)

<223> coding for haloalkane dehalogenase 
<400> 11 

atg tca acg ttt ttt gaa ccg gag aac gga atg aaa caa aac gcc aaa 48
Met Ser Thr Phe Phe Glu Pro Glu Asn Gly Met Lys Gln Asn Ala Lys
1 5 10 15

acc gaa cga atc ctg gat gtc gcg ctc gaa ttg ctt gag aca gag ggt 96
Thr Glu Arg Ile Leu Asp Val Ala Leu Glu Leu Leu Glu Thr Glu Gly
20 25 30

gag ttt ggt ttg acg atg agg cag gtg gca acg caa gcg gac atg tcc 144
Glu Phe Gly Leu Thr Met Arg Gln Val Ala Thr Gln Ala Asp Met Ser
35 40 45

ctg agc aac gtt cag tac tat ttc aag tcc gag gac ctg ctc ctc gtg 192
Leu Ser Asn Val Gln Tyr Tyr Phe Lys Ser Glu Asp Leu Leu Leu Val
50 55 60

gcc atg gca gac cgt tac ttt caa cgg tgc ctg aca acc atg gct gag 240
Ala Met Ala Asp Arg Tyr Phe Gln Arg Cys Leu Thr Thr Met Ala Glu
65 70 75 80

cat ccg ccc tta tcg gca ggg cgt gat caa cac gcc cag tta aga gcg 288
His Pro Pro Leu Ser Ala Gly Arg Asp Gln His Ala Gln Leu Arg Ala
85 90 95

ttg tta cga gaa ctg ctc ggt cat ggt ctt gag att tcc gag atg tgt 336
Leu Leu Arg Glu Leu Leu Gly His Gly Leu Glu Ile Ser Glu Met Cys
100 105 110

cga ata ttc agg gag tac tgg gca atc gcc acc cgt aat gaa act gtt 384
Arg Ile Phe Arg Glu Tyr Trp Ala Ile Ala Thr Arg Asn Glu Thr Val
115 120 125

cac ggc tat ctc aag tcg tac tat cgg gat ctc gcc gaa gtg atg gct 432
His Gly Tyr Leu Lys Ser Tyr Tyr Arg Asp Leu Ala Glu Val Met Ala
130 135 140

gag aag ctt gcg cca ctg gcc agc agc gaa aag gcg ctg gcc gtg gcc 480
Glu Lys Leu Ala Pro Leu Ala Ser Ser Glu Lys Ala Leu Ala Val Ala
145 150 155 160

gta tct ttg gtt att cct tat gtt gag ggg tat tcg gta acg gcc att 528
Val Ser Leu Val Ile Pro Tyr Val Glu Gly Tyr Ser Val Thr Ala Ile
165 170 175

gca atg ccc gaa tcc att gat acg att tcc gag acg ctg acc aat gtg 576
Ala Met Pro Glu Ser Ile Asp Thr Ile Ser Glu Thr Leu Thr Asn Val
180 185 190

gtg ttg gag cag ctt cgc atc agc aat tcatga 609
Val Leu Glu Gln Leu Arg Ile Ser Asn
195 200

<210> 12 
<211> 201 
<212> PRT 

<213> Xanthobacter autotrophicus
<400> 12

Met Ser Thr Phe Phe Glu Pro Glu Asn Gly Met Lys Gln Asn Ala Lys
1 5 10 15

Thr Glu Arg Ile Leu Asp Val Ala Leu Glu Leu Leu Glu Thr Glu Gly
20 25 30

Glu Phe Gly Leu Thr Met Arg Gln Val Ala Thr Gln Ala Asp Met Ser
35 40 45

Leu Ser Asn Val Gln Tyr Tyr Phe Lys Ser Glu Asp Leu Leu Leu Val
50 55 60

Ala Met Ala Asp Arg Tyr Phe Gln Arg Cys Leu Thr Thr Met Ala Glu
65 70 75 80

His Pro Pro Leu Ser Ala Gly Arg Asp Gln His Ala Gln Leu Arg Ala
85 90 95

Leu Leu Arg Glu Leu Leu Gly His Gly Leu Glu Ile Ser Glu Met Cys
100 105 110

Arg Ile Phe Arg Glu Tyr Trp Ala Ile Ala Thr Arg Asn Glu Thr Val
115 120 125

His Gly Tyr Leu Lys Ser Tyr Tyr Arg Asp Leu Ala Glu Val Met Ala
130 135 140

Glu Lys Leu Ala Pro Leu Ala Ser Ser Glu Lys Ala Leu Ala Val Ala
145 150 155 160

Val Ser Leu Val Ile Pro Tyr Val Glu Gly Tyr Ser Val Thr Ala Ile
165 170 175

Ala Met Pro Glu Ser Ile Asp Thr Ile Ser Glu Thr Leu Thr Asn Val
180 185 190

Val Leu Glu Gln Leu Arg Ile Ser Asn
195 200



<210> 13 
<211> 1131 
<212> DNA 

<213> Herpes simplex virus 1 

<220>

<221> CDS

<222> (1)..(1128)

<223> coding for thymidine kinase (TK)
<400> 13 

atg gct tcg tac ccc tgc cat caa cac gcg tct gcg ttc gac cag gct 48
Met Ala Ser Tyr Pro Cys His Gln His Ala Ser Ala Phe Asp Gln Ala
1 5 10 15

gcg cgt tct cgc ggc cat agc aac cga cgt acg gcg ttg cgc cct cgc 96
Ala Arg Ser Arg Gly His Ser Asn Arg Arg Thr Ala Leu Arg Pro Arg
20 25 30

cgg cag caa gaa gcc acg gaa gtc cgc ctg gag cag aaa atg ccc acg 144
Arg Gln Gln Glu Ala Thr Glu Val Arg Leu Glu Gln Lys Met Pro Thr
35 40 45

cta ctg cgg gtt tat ata gac ggt cct cac ggg atg ggg aaa acc acc 192
Leu Leu Arg Val Tyr Ile Asp Gly Pro His Gly Met Gly Lys Thr Thr
50 55 60

acc acg caa ctg ctg gtg gcc ctg ggt tcg cgc gac gat atc gtc tac 240
Thr Thr Gln Leu Leu Val Ala Leu Gly Ser Arg Asp Asp Ile Val Tyr
65 70 75 80

gta ccc gag ccg atg act tac tgg cag gtg ctg ggg gct tcc gag aca 288
Val Pro Glu Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr
85 90 95

atc gcg aac atc tac acc aca caa cac cgc ctc gac cag ggt gag ata 336
Ile Ala Asn Ile Tyr Thr Thr Gln His Arg Leu Asp Gln Gly Glu Ile
100 105 110

tcg gcc ggg gac gcg gcg gtg gta atg aca agc gcc cag ata aca atg 384
Ser Ala Gly Asp Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met
115 120 125

ggc atg cct tat gcc gtg acc gac gcc gtt ctg gct cct cat gtc ggg 432
Gly Met Pro Tyr Ala Val Thr Asp Ala Val Leu Ala Pro His Val Gly
130 135 140

ggg gag gct ggg agt tca cat gcc ccg ccc ccg gcc ctc acc ctc atc 480
Gly Glu Ala Gly Ser Ser His Ala Pro Pro Pro Ala Leu Thr Leu Ile
145 150 155 160

ttc gac cgc cat ccc atc gcc gcc ctc ctg tgc tac ccg gcc gcg cga 528
Phe Asp Arg His Pro Ile Ala Ala Leu Leu Cys Tyr Pro Ala Ala Arg
165 170 175

tac ctt atg ggc agc atg acc ccc cag gcc gtg ctg gcg ttc gtg gcc 576
Tyr Leu Met Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Ala
180 185 190

ctc atc ccg ccg acc ttg ccc ggc aca aac atc gtg ttg ggg gcc ctt 624
Leu Ile Pro Pro Thr Leu Pro Gly Thr Asn Ile Val Leu Gly Ala Leu
195 200 205

ccg gag gac aga cac atc gac cgc ctg gcc aaa cgc cag cgc ccc ggc 672
Pro Glu Asp Arg His Ile Asp Arg Leu Ala Lys Arg Gln Arg Pro Gly
210 215 220

gag cgg ctt gac ctg gct atg ctg gcc gcg att cgc cgc gtt tac ggg 720
Glu Arg Leu Asp Leu Ala Met Leu Ala Ala Ile Arg Arg Val Tyr Gly
225 230 235 240




ctg ctt gcc aat acg gtg cgg tat ctg cag ggc ggc ggg tcg tgg tgg 768
Leu Leu Ala Asn Thr Val Arg Tyr Leu Gln Gly Gly Gly Ser Trp Trp
245 250 255

gag gat tgg gga cag ctt tcg ggg acg gcc gtg ccg ccc cag ggt gcc 816
Glu Asp Trp Gly Gln Leu Ser Gly Thr Ala Val Pro Pro Gln Gly Ala
260 265 270

gag ccc cag agc aac gcg ggc cca cga ccc cat atc ggg gac acg tta 864
Glu Pro Gln Ser Asn Ala Gly Pro Arg Pro His Ile Gly Asp Thr Leu
275 280 285

ttt acc ctg ttt cgg gcc ccc gag ttg ctg gcc ccc aac ggc gac ctg 912
Phe Thr Leu Phe Arg Ala Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu
290 295 300

tat aac gtg ttt gcc tgg gcc ttg gac gtc ttg gcc aaa cgc ctc cgt 960
Tyr Asn Val Phe Ala Trp Ala Leu Asp Val Leu Ala Lys Arg Leu Arg
305 310 315 320

ccc atg cac gtc ttt atc ctg gat tac gac caa tcg ccc gcc ggc tgc 1008
Pro Met His Val Phe Ile Leu Asp Tyr Asp Gln Ser Pro Ala Gly Cys
325 330 335

cgg gac gcc ctg ctg caa ctt acc tcc ggg atg gtc cag acc cac gtc 1056
Arg Asp Ala Leu Leu Gln Leu Thr Ser Gly Met Val Gln Thr His Val
340 345 350

acc acc cca ggc tcc ata ccg acg atc tgc gac ctg gcg cgc acg ttt 1104
Thr Thr Pro Gly Ser Ile Pro Thr Ile Cys Asp Leu Ala Arg Thr Phe
355 360 365

gcc cgg gag atg ggg gag gct aac tga 1131
Ala Arg Glu Met Gly Glu Ala Asn
370 375

<210> 14 
<211> 376 
<212> PRT 

<213> Herpes simplex virus 1 
<400> 14 

Met Ala Ser Tyr Pro Cys His Gln His Ala Ser Ala Phe Asp Gln Ala
1 5 10 15

Ala Arg Ser Arg Gly His Ser Asn Arg Arg Thr Ala Leu Arg Pro Arg
20 25 30

Arg Gln Gln Glu Ala Thr Glu Val Arg Leu Glu Gln Lys Met Pro Thr
35 40 45

Leu Leu Arg Val Tyr Ile Asp Gly Pro His Gly Met Gly Lys Thr Thr
50 55 60

Thr Thr Gln Leu Leu Val Ala Leu Gly Ser Arg Asp Asp Ile Val Tyr
65 70 75 80

Val Pro Glu Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr
85 90 95

Ile Ala Asn Ile Tyr Thr Thr Gln His Arg Leu Asp Gln Gly Glu Ile
100 105 110

Ser Ala Gly Asp Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met
115 120 125



Gly Met Pro Tyr Ala Val Thr Asp Ala Val Leu Ala Pro His Val Gly
130 135 140

Gly Glu Ala Gly Ser Ser His Ala Pro Pro Pro Ala Leu Thr Leu Ile
145 150 155 160

Phe Asp Arg His Pro Ile Ala Ala Leu Leu Cys Tyr Pro Ala Ala Arg
165 170 175

Tyr Leu Met Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Ala
180 185 190

Leu Ile Pro Pro Thr Leu Pro Gly Thr Asn Ile Val Leu Gly Ala Leu
195 200 205

Pro Glu Asp Arg His Ile Asp Arg Leu Ala Lys Arg Gln Arg Pro Gly
210 215 220

Glu Arg Leu Asp Leu Ala Met Leu Ala Ala Ile Arg Arg Val Tyr Gly
225 230 235 240

Leu Leu Ala Asn Thr Val Arg Tyr Leu Gln Gly Gly Gly Ser Trp Trp
245 250 255

Glu Asp Trp Gly Gln Leu Ser Gly Thr Ala Val Pro Pro Gln Gly Ala
260 265 270

Glu Pro Gln Ser Asn Ala Gly Pro Arg Pro His Ile Gly Asp Thr Leu
275 280 285

Phe Thr Leu Phe Arg Ala Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu
290 295 300

Tyr Asn Val Phe Ala Trp Ala Leu Asp Val Leu Ala Lys Arg Leu Arg
305 310 315 320

Pro Met His Val Phe Ile Leu Asp Tyr Asp Gln Ser Pro Ala Gly Cys
325 330 335

Arg Asp Ala Leu Leu Gln Leu Thr Ser Gly Met Val Gln Thr His Val
340 345 350

Thr Thr Pro Gly Ser Ile Pro Thr Ile Cys Asp Leu Ala Arg Thr Phe
355 360 365

Ala Arg Glu Met Gly Glu Ala Asn
370 375



<210> 15 
<211> 1131 
<212> DNA 

<213> Herpes simplex virus 1 

<220> 

<221> CDS 

<222> (1)..(1128)

<223> coding for thymidine kinase (TK)
<400> 15 

atg gct tcg tac ccc tgc cat caa cac gcg tct gcg ttc gac cag gct 48
Met Ala Ser Tyr Pro Cys His Gln His Ala Ser Ala Phe Asp Gln Ala
1 5 10 15

gcg cgt tct cgc ggc cat agc aac cga cgt acg gcg ttg cgc cct cgc 96
Ala Arg Ser Arg Gly His Ser Asn Arg Arg Thr Ala Leu Arg Pro Arg
20 25 30

cgg cag caa gaa gcc acg gaa gtc cgc ctg gag cag aaa atg ccc acg 144
Arg Gln Gln Glu Ala Thr Glu Val Arg Leu Glu Gln Lys Met Pro Thr
35 40 45

cta ctg cgg gtt tat ata gac ggt cct cac ggg atg ggg aaa acc acc 192
Leu Leu Arg Val Tyr Ile Asp Gly Pro His Gly Met Gly Lys Thr Thr
50 55 60

acc acg caa ctg ctg gtg gcc ctg ggt tcg cgc gac gat atc gtc tac 240
Thr Thr Gln Leu Leu Val Ala Leu Gly Ser Arg Asp Asp Ile Val Tyr
65 70 75 80

gta ccc gag ccg atg act tac tgg cag gtg ctg ggg gct tcc gag aca 288
Val Pro Glu Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr
85 90 95

atc gcg aac atc tac acc aca caa cac cgc ctc gac cag ggt gag ata 336
Ile Ala Asn Ile Tyr Thr Thr Gln His Arg Leu Asp Gln Gly Glu Ile
100 105 110

tcg gcc ggg gac gcg gcg gtg gta atg aca agc gcc cag ata aca atg 384
Ser Ala Gly Asp Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met
115 120 125

ggc atg cct tat gcc gtg acc gac gcc gtt ctg gct cct cat gtc ggg 432
Gly Met Pro Tyr Ala Val Thr Asp Ala Val Leu Ala Pro His Val Gly
130 135 140

ggg gag gct ggg agt tca cat gcc ccg ccc ccg gcc ctc acc ctc atc 480
Gly Glu Ala Gly Ser Ser His Ala Pro Pro Pro Ala Leu Thr Leu Ile
145 150 155 160

ttc gac cgc cat ccc atc gcc gcc ctc ctg tgc tac ccg gcc gcg cga 528
Phe Asp Arg His Pro Ile Ala Ala Leu Leu Cys Tyr Pro Ala Ala Arg
165 170 175

tac ctt atg ggc agc atg acc ccc cag gcc gtg ctg gcg ttc gtg gcc 576
Tyr Leu Met Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Ala
180 185 190

ctc atc ccg ccg acc ttg ccc ggc aca aac atc gtg ttg ggg gcc ctt 624
Leu Ile Pro Pro Thr Leu Pro Gly Thr Asn Ile Val Leu Gly Ala Leu
195 200 205

ccg gag gac aga cac atc gac cgc ctg gcc aaa cgc cag cgc ccc ggc 672
Pro Glu Asp Arg His Ile Asp Arg Leu Ala Lys Arg Gln Arg Pro Gly
210 215 220

gag cgg ctt gac ctg gct atg ctg gcc gcg att cgc cgc gtt tac ggg 720
Glu Arg Leu Asp Leu Ala Met Leu Ala Ala Ile Arg Arg Val Tyr Gly
225 230 235 240

ctg ctt gcc aat acg gtg cgg tat ctg cag ggc ggc ggg tcg tgg tgg 768
Leu Leu Ala Asn Thr Val Arg Tyr Leu Gln Gly Gly Gly Ser Trp Trp
245 250 255

gag gat tgg gga cag ctt tcg ggg acg gcc gtg ccg ccc cag ggt gcc 816
Glu Asp Trp Gly Gln Leu Ser Gly Thr Ala Val Pro Pro Gln Gly Ala
260 265 270

gag ccc cag agc aac gcg ggc cca cga ccc cat atc ggg gac acg tta 864
Glu Pro Gln Ser Asn Ala Gly Pro Arg Pro His Ile Gly Asp Thr Leu
275 280 285

ttt acc ctg ttt cgg gcc ccc gag ttg ctg gcc ccc aac ggc gac ctg 912
Phe Thr Leu Phe Arg Ala Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu
290 295 300

tat aac gtg ttt gcc tgg gcc ttg gac gtc ttg gcc aaa cgc ctc cgt 960
Tyr Asn Val Phe Ala Trp Ala Leu Asp Val Leu Ala Lys Arg Leu Arg
305 310 315 320

ccc atg cac gtc ttt atc ctg gat tac gac caa tcg ccc gcc ggc tgc 1008
Pro Met His Val Phe Ile Leu Asp Tyr Asp Gln Ser Pro Ala Gly Cys
325 330 335

cgg gac gcc ctg ctg caa ctt acc tcc ggg atg gtc cag acc cac gtc 1056
Arg Asp Ala Leu Leu Gln Leu Thr Ser Gly Met Val Gln Thr His Val
340 345 350

acc acc cca ggc tcc ata ccg acg atc tgc gac ctg gcg cgc acg ttt 1104
Thr Thr Pro Gly Ser Ile Pro Thr Ile Cys Asp Leu Ala Arg Thr Phe
355 360 365

gcc cgg gag atg ggg gag gct aac tga 1131
Ala Arg Glu Met Gly Glu Ala Asn
370 375

<210> 16 
<211> 376 
<212> PRT 

<213> Herpes simplex virus 1
<400> 16 

Met Ala Ser Tyr Pro Cys His Gln His Ala Ser Ala Phe Asp Gln Ala
1 5 10 15

Ala Arg Ser Arg Gly His Ser Asn Arg Arg Thr Ala Leu Arg Pro Arg
20 25 30

Arg Gln Gln Glu Ala Thr Glu Val Arg Leu Glu Gln Lys Met Pro Thr
35 40 45

Leu Leu Arg Val Tyr Ile Asp Gly Pro His Gly Met Gly Lys Thr Thr
50 55 60

Thr Thr Gln Leu Leu Val Ala Leu Gly Ser Arg Asp Asp Ile Val Tyr
65 70 75 80

Val Pro Glu Pro Met Thr Tyr Trp Gln Val Leu Gly Ala Ser Glu Thr
85 90 95

Ile Ala Asn Ile Tyr Thr Thr Gln His Arg Leu Asp Gln Gly Glu Ile
100 105 110

Ser Ala Gly Asp Ala Ala Val Val Met Thr Ser Ala Gln Ile Thr Met
115 120 125

Gly Met Pro Tyr Ala Val Thr Asp Ala Val Leu Ala Pro His Val Gly
130 135 140

Gly Glu Ala Gly Ser Ser His Ala Pro Pro Pro Ala Leu Thr Leu Ile
145 150 155 160

Phe Asp Arg His Pro Ile Ala Ala Leu Leu Cys Tyr Pro Ala Ala Arg
165 170 175

Tyr Leu Met Gly Ser Met Thr Pro Gln Ala Val Leu Ala Phe Val Ala
180 185 190

Leu Ile Pro Pro Thr Leu Pro Gly Thr Asn Ile Val Leu Gly Ala Leu
195 200 205

Pro Glu Asp Arg His Ile Asp Arg Leu Ala Lys Arg Gln Arg Pro Gly
210 215 220

Glu Arg Leu Asp Leu Ala Met Leu Ala Ala Ile Arg Arg Val Tyr Gly
225 230 235 240

Leu Leu Ala Asn Thr Val Arg Tyr Leu Gln Gly Gly Gly Ser Trp Trp
245 250 255

Glu Asp Trp Gly Gln Leu Ser Gly Thr Ala Val Pro Pro Gln Gly Ala
260 265 270

Glu Pro Gln Ser Asn Ala Gly Pro Arg Pro His Ile Gly Asp Thr Leu
275 280 285

Phe Thr Leu Phe Arg Ala Pro Glu Leu Leu Ala Pro Asn Gly Asp Leu
290 295 300

Tyr Asn Val Phe Ala Trp Ala Leu Asp Val Leu Ala Lys Arg Leu Arg
305 310 315 320

Pro Met His Val Phe Ile Leu Asp Tyr Asp Gln Ser Pro Ala Gly Cys
325 330 335

Arg Asp Ala Leu Leu Gln Leu Thr Ser Gly Met Val Gln Thr His Val
340 345 350

Thr Thr Pro Gly Ser Ile Pro Thr Ile Cys Asp Leu Ala Arg Thr Phe
355 360 365

Ala Arg Glu Met Gly Glu Ala Asn
370 375



<210> 17 
<211> 840
<212> DNA 

<213> Toxoplasma gondii 

<220> 

<221> CDS 

<222> (1)..(837)

<223> coding for hypoxanthine-xanthine-guanine
phosphoribosyl transferase (HXGPRTase)

<400> 17 

atg gcg tcc aaa ccc att gaa gaa tcc cgg tcg caa aaa cgg agt gcc 48
Met Ala Ser Lys Pro Ile Glu Glu Ser Arg Ser Gln Lys Arg Ser Ala
1 5 10 15

ttc tca gac atc ttc tgt tgt tgc act cct aat gaa ggg gct atc gtg 96
Phe Ser Asp Ile Phe Cys Cys Cys Thr Pro Asn Glu Gly Ala Ile Val
20 25 30

ccc agt gac cca atg gtc tcc acc agt gct cca gca cgc acc agt gct 144
Pro Ser Asp Pro Met Val Ser Thr Ser Ala Pro Ala Arg Thr Ser Ala
35 40 45

cca gcg cgc tcc agt gca ctt caa gac tac ggc aag ggc aag ggc cgt 192
Pro Ala Arg Ser Ser Ala Leu Gln Asp Tyr Gly Lys Gly Lys Gly Arg
50 55 60

att gag ccc atg tat atc ccc gac aac acc ttc tac aac gct gat gac 240
Ile Glu Pro Met Tyr Ile Pro Asp Asn Thr Phe Tyr Asn Ala Asp Asp
65 70 75 80

ttt ctt gtg ccc ccc cac tgc aag ccc tac att gac aaa atc ctc ctc 288
Phe Leu Val Pro Pro His Cys Lys Pro Tyr Ile Asp Lys Ile Leu Leu
85 90 95

cct ggt gga ttg gtc aag gac aga gtt gag aag ttg gcg tat gac atc 336
Pro Gly Gly Leu Val Lys Asp Arg Val Glu Lys Leu Ala Tyr Asp Ile
100 105 110

cac aga act tac ttc ggc gag gag ttg cac atc att tgc atc ctg aaa 384
His Arg Thr Tyr Phe Gly Glu Glu Leu His Ile Ile Cys Ile Leu Lys
115 120 125

ggc tct cgc ggc ttc ttc aac ctt ctg atc gac tac ctt gcc acc ata 432
Gly Ser Arg Gly Phe Phe Asn Leu Leu Ile Asp Tyr Leu Ala Thr Ile
130 135 140

cag aag tac agt ggt cgt gag tcc agc gtg ccc ccc ttc ttc gag cac 480
Gln Lys Tyr Ser Gly Arg Glu Ser Ser Val Pro Pro Phe Phe Glu His
145 150 155 160

tat gtc cgc ctg aag tcc tac cag aac gac aac agc aca ggc cag ctc 528
Tyr Val Arg Leu Lys Ser Tyr Gln Asn Asp Asn Ser Thr Gly Gln Leu
165 170 175

acc gtc ttg agc gac gac ttg tca atc ttt cgc gac aag cac gtt ctg 576
Thr Val Leu Ser Asp Asp Leu Ser Ile Phe Arg Asp Lys His Val Leu
180 185 190

att gtt gag gac atc gtc gac acc ggt ttc acc ctc acc gag ttc ggt 624
Ile Val Glu Asp Ile Val Asp Thr Gly Phe Thr Leu Thr Glu Phe Gly
195 200 205

gag cgc ctg aaa gcc gtc ggt ccc aag tcg atg aga atc gcc acc ctc 672
Glu Arg Leu Lys Ala Val Gly Pro Lys Ser Met Arg Ile Ala Thr Leu
210 215 220

gtc gag aag cgc aca gat cgc tcc aac agc ttg aag ggc gac ttc gtc 720
Val Glu Lys Arg Thr Asp Arg Ser Asn Ser Leu Lys Gly Asp Phe Val
225 230 235 240

ggc ttc agc att gaa gac gtc tgg atc gtt ggt tgc tgc tac gac ttc 768
Gly Phe Ser Ile Glu Asp Val Trp Ile Val Gly Cys Cys Tyr Asp Phe
245 250 255

aac gag atg ttc cgc gac ttc gac cac gtc gcc gtc ctg agc gac gcc 816
Asn Glu Met Phe Arg Asp Phe Asp His Val Ala Val Leu Ser Asp Ala
260 265 270

gct cgc aaa aag ttc gag aag taa 840
Ala Arg Lys Lys Phe Glu Lys
275

<210> 18 
<211> 279 
<212> PRT 

<213> Toxoplasma gondii 
<400> 18 

Met Ala Ser Lys Pro Ile Glu Glu Ser Arg Ser Gln Lys Arg Ser Ala
1 5 10 15

Phe Ser Asp Ile Phe Cys Cys Cys Thr Pro Asn Glu Gly Ala Ile Val
20 25 30

Pro Ser Asp Pro Met Val Ser Thr Ser Ala Pro Ala Arg Thr Ser Ala
35 40 45

Pro Ala Arg Ser Ser Ala Leu Gln Asp Tyr Gly Lys Gly Lys Gly Arg
50 55 60

Ile Glu Pro Met Tyr Ile Pro Asp Asn Thr Phe Tyr Asn Ala Asp Asp
65 70 75 80

Phe Leu Val Pro Pro His Cys Lys Pro Tyr Ile Asp Lys Ile Leu Leu
85 90 95

Pro Gly Gly Leu Val Lys Asp Arg Val Glu Lys Leu Ala Tyr Asp Ile
100 105 110

His Arg Thr Tyr Phe Gly Glu Glu Leu His Ile Ile Cys Ile Leu Lys
115 120 125

Gly Ser Arg Gly Phe Phe Asn Leu Leu Ile Asp Tyr Leu Ala Thr Ile
130 135 140

Gln Lys Tyr Ser Gly Arg Glu Ser Ser Val Pro Pro Phe Phe Glu His
145 150 155 160

Tyr Val Arg Leu Lys Ser Tyr Gln Asn Asp Asn Ser Thr Gly Gln Leu
165 170 175

Thr Val Leu Ser Asp Asp Leu Ser Ile Phe Arg Asp Lys His Val Leu
180 185 190

Ile Val Glu Asp Ile Val Asp Thr Gly Phe Thr Leu Thr Glu Phe Gly
195 200 205

Glu Arg Leu Lys Ala Val Gly Pro Lys Ser Met Arg Ile Ala Thr Leu
210 215 220

Val Glu Lys Arg Thr Asp Arg Ser Asn Ser Leu Lys Gly Asp Phe Val
225 230 235 240

Gly Phe Ser Ile Glu Asp Val Trp Ile Val Gly Cys Cys Tyr Asp Phe
245 250 255 

Asn Glu Met Phe Arg Asp Phe Asp His Val Ala Val Leu Ser Asp Ala 
260 265 270 

Ala Arg Lys Lys Phe Glu Lys
275

<210> 19
<211> 459
<212> DNA

<213> Escherichia coli

<220>

<221> CDS

<222> (1)..(456)

<223> coding for xanthine-guanine phosphoribosyltransferase (gpt)

<400> 19

atg agc gaa aaa tac atc gtc acc tgg gac atg ttg cag atc cat gca 48
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

cgt aaa ctc gca agc cga ctg atg cct tct gaa caa tgg aaa ggc att 96
Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

att gcc gta agc cgt ggc ggt ctg gta ccg ggt gcg tta ctg gcg cgt 144
Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

gaa ctg ggt att cgt cat gtc gat acc gtt tgt att tcc agc tac gat 192
Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

cac gac aac cag cgc gag ctt aaa gtg ctg aaa cgc gca gaa ggc gat 240
His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

ggc gaa ggc ttc atc gtt att gat gac ctg gtg gat acc ggt ggt act 288
Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

gcg gtt gcg att cgt gaa atg tat cca aaa gcg cac ttt gtc acc atc 336
Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

ttc gca aaa ccg gct ggt cgt ccg ctg gtt gat gac tat gtt gtt gat 384
Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

atc ccg caa gat acc tgg att gaa cag ccg tgg gat atg ggc gtc gta 432
Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

ttc gtc ccg cca atc tcc ggt cgc taa 459
Phe Val Pro Pro Ile Ser Gly Arg
145 150

<210> 20
<211> 152
<212> PRT

<213> Escherichia coli
<400> 20

Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

Phe Val Pro Pro Ile Ser Gly Arg
145 150

<210> 21
<211> 459
<212> DNA

<213> Escherichia coli

<220>

<221> CDS

<222> (1)..(456)

<223> coding for xanthine-guanine phosphoribosyltransferase (gpt)

<400> 21

atg agc gaa aaa tac atc gtc acc tgg gac atg ttg cag atc cat gca 48
Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

cgt aaa ctc gca agc cga ctg atg cct tct gaa caa tgg aaa ggc att 96
Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

att gcc gta agc cgt ggc ggt ctg gta ccg ggt gcg tta ctg gcg cgt 144
Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

gaa ctg ggt att cgt cat gtc gat acc gtt tgt att tcc agc tac gat 192
Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

cac gac aac cag cgc gag ctt aaa gtg ctg aaa cgc gca gaa ggc gat 240
His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

ggc gaa ggc ttc atc gtt att gat gac ctg gtg gat acc ggt ggt act 288
Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

gcg gtt gcg att cgt gaa atg tat cca aaa gcg cac ttt gtc acc atc 336
Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

ttc gca aaa ccg gct ggt cgt ccg ctg gtt gat gac tat gtt gtt gat 384
Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

atc ccg caa gat acc tgg att gaa cag ccg tgg gat atg ggc gtc gta 432
Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

ttc gtc ccg cca atc tcc ggt cgc taa 459
Phe Val Pro Pro Ile Ser Gly Arg
145 150

<210> 22
<211> 152
<212> PRT

<213> Escherichia coli
<400> 22

Met Ser Glu Lys Tyr Ile Val Thr Trp Asp Met Leu Gln Ile His Ala
1 5 10 15

Arg Lys Leu Ala Ser Arg Leu Met Pro Ser Glu Gln Trp Lys Gly Ile
20 25 30

Ile Ala Val Ser Arg Gly Gly Leu Val Pro Gly Ala Leu Leu Ala Arg
35 40 45

Glu Leu Gly Ile Arg His Val Asp Thr Val Cys Ile Ser Ser Tyr Asp
50 55 60

His Asp Asn Gln Arg Glu Leu Lys Val Leu Lys Arg Ala Glu Gly Asp
65 70 75 80

Gly Glu Gly Phe Ile Val Ile Asp Asp Leu Val Asp Thr Gly Gly Thr
85 90 95

Ala Val Ala Ile Arg Glu Met Tyr Pro Lys Ala His Phe Val Thr Ile
100 105 110

Phe Ala Lys Pro Ala Gly Arg Pro Leu Val Asp Asp Tyr Val Val Asp
115 120 125

Ile Pro Gln Asp Thr Trp Ile Glu Gln Pro Trp Asp Met Gly Val Val
130 135 140

Phe Val Pro Pro Ile Ser Gly Arg
145 150

<210> 23
<211> 720
<212> DNA

<213> Escherichia coli

<220>

<221> CDS

<222> (1)..(717)

<223> coding for purine nucleoside phosphorylase (deoD)
<400> 23

atg gct acc cca cac att aat gca gaa atg ggc gat ttc gct gac gta 48
Met Ala Thr Pro His Ile Asn Ala Glu Met Gly Asp Phe Ala Asp Val
1 5 10 15

gtt ttg atg cca ggc gac ccg ctg cgt gcg aag tat att gct gaa act 96
Val Leu Met Pro Gly Asp Pro Leu Arg Ala Lys Tyr Ile Ala Glu Thr
20 25 30

ttc ctt gaa gat gcc cgt gaa gtg aac aac gtt cgc ggt atg ctg ggc 144
Phe Leu Glu Asp Ala Arg Glu Val Asn Asn Val Arg Gly Met Leu Gly
35 40 45

ttc acc ggt act tac aaa ggc cgc aaa att tcc gta atg ggt cac ggt 192
Phe Thr Gly Thr Tyr Lys Gly Arg Lys Ile Ser Val Met Gly His Gly
50 55 60

atg ggt atc ccg tcc tgc tcc atc tac acc aaa gaa ctg atc acc gat 240
Met Gly Ile Pro Ser Cys Ser Ile Tyr Thr Lys Glu Leu Ile Thr Asp
65 70 75 80

ttc ggc gtg aag aaa att atc cgc gtg ggt tcc tgt ggc gca gtt ctg 288
Phe Gly Val Lys Lys Ile Ile Arg Val Gly Ser Cys Gly Ala Val Leu
85 90 95

ccg cac gta aaa ctg cgc gac gtc gtt atc ggt atg ggt gcc tgc acc 336
Pro His Val Lys Leu Arg Asp Val Val Ile Gly Met Gly Ala Cys Thr
100 105 110

gat tcc aaa gtt aac cgc atc cgt ttt aaa gac cat gac ttt gcc gct 384
Asp Ser Lys Val Asn Arg Ile Arg Phe Lys Asp His Asp Phe Ala Ala
115 120 125

atc gct gac ttc gac atg gtg cgt aac gca gta gat gca gct aaa gca 432
Ile Ala Asp Phe Asp Met Val Arg Asn Ala Val Asp Ala Ala Lys Ala
130 135 140

ctg ggt att gat gct cgc gtg ggt aac ctg ttc tcc gct gac ctg ttc 480
Leu Gly Ile Asp Ala Arg Val Gly Asn Leu Phe Ser Ala Asp Leu Phe
145 150 155 160

tac tct ccg gac ggc gaa atg ttc gac gtg atg gaa aaa tac ggc att 528
Tyr Ser Pro Asp Gly Glu Met Phe Asp Val Met Glu Lys Tyr Gly Ile
165 170 175

ctc ggc gtg gaa atg gaa gcg gct ggt atc tac ggc gtc gct gca gaa 576
Leu Gly Val Glu Met Glu Ala Ala Gly Ile Tyr Gly Val Ala Ala Glu
180 185 190

ttt ggc gcg aaa gcc ctg acc atc tgc acc gta tct gac cac atc cgc 624
Phe Gly Ala Lys Ala Leu Thr Ile Cys Thr Val Ser Asp His Ile Arg
195 200 205

act cac gag cag acc act gcc gct gag cgt cag act acc ttc aac gac 672
Thr His Glu Gln Thr Thr Ala Ala Glu Arg Gln Thr Thr Phe Asn Asp
210 215 220

atg atc aaa atc gca ctg gaa tcc gtt ctg ctg ggc gat aaa gag taa 720
Met Ile Lys Ile Ala Leu Glu Ser Val Leu Leu Gly Asp Lys Glu
225 230 235

<210> 24
<211> 239
<212> PRT

<213> Escherichia coli

<400> 24

Met Ala Thr Pro His Ile Asn Ala Glu Met Gly Asp Phe Ala Asp Val
1 5 10 15

Val Leu Met Pro Gly Asp Pro Leu Arg Ala Lys Tyr Ile Ala Glu Thr
20 25 30

Phe Leu Glu Asp Ala Arg Glu Val Asn Asn Val Arg Gly Met Leu Gly
35 40 45

Phe Thr Gly Thr Tyr Lys Gly Arg Lys Ile Ser Val Met Gly His Gly
50 55 60

Met Gly Ile Pro Ser Cys Ser Ile Tyr Thr Lys Glu Leu Ile Thr Asp
65 70 75 80

Phe Gly Val Lys Lys Ile Ile Arg Val Gly Ser Cys Gly Ala Val Leu
85 90 95

Pro His Val Lys Leu Arg Asp Val Val Ile Gly Met Gly Ala Cys Thr
100 105 110

Asp Ser Lys Val Asn Arg Ile Arg Phe Lys Asp His Asp Phe Ala Ala
115 120 125

Ile Ala Asp Phe Asp Met Val Arg Asn Ala Val Asp Ala Ala Lys Ala
130 135 140

Leu Gly Ile Asp Ala Arg Val Gly Asn Leu Phe Ser Ala Asp Leu Phe
145 150 155 160

Tyr Ser Pro Asp Gly Glu Met Phe Asp Val Met Glu Lys Tyr Gly Ile
165 170 175

Leu Gly Val Glu Met Glu Ala Ala Gly Ile Tyr Gly Val Ala Ala Glu
180 185 190

Phe Gly Ala Lys Ala Leu Thr Ile Cys Thr Val Ser Asp His Ile Arg
195 200 205

Thr His Glu Gln Thr Thr Ala Ala Glu Arg Gln Thr Thr Phe Asn Asp
210 215 220

Met Ile Lys Ile Ala Leu Glu Ser Val Leu Leu Gly Asp Lys Glu
225 230 235

<210> 25
<211> 1545
<212> DNA

<213> Burkholderia caryophylli

<220>

<221> CDS

<222> (1)..(1542)

<223> coding for phosphonate monoester hydrolase (pehA)
<400> 25

atg acc aga aaa aat gtc ctg ctt atc gtc gtt gat caa tgg cga gca 48
Met Thr Arg Lys Asn Val Leu Leu Ile Val Val Asp Gln Trp Arg Ala
1 5 10 15

gat ttt atc cct cac ctg atg cgg gcg gag ggg cgc gaa cct ttc ctt 96
Asp Phe Ile Pro His Leu Met Arg Ala Glu Gly Arg Glu Pro Phe Leu
20 25 30

aaa act ccc aat ctt gat cgt ctt tgc cgg gaa ggc ttg acc ttc cgc 144
Lys Thr Pro Asn Leu Asp Arg Leu Cys Arg Glu Gly Leu Thr Phe Arg
35 40 45

aat cat gtc acg acg tgc gtg ccg tgt ggt ccg gca agg gca agc ctg 192
Asn His Val Thr Thr Cys Val Pro Cys Gly Pro Ala Arg Ala Ser Leu
50 55 60

ctg acg ggc ctc tac ctg atg aac cac cgg gcg gtg cag aac act gtt 240
Leu Thr Gly Leu Tyr Leu Met Asn His Arg Ala Val Gln Asn Thr Val
65 70 75 80

ccg ctt gac cag cgc cat cta aac ctt ggc aag gcc ctg cgc gcc att 288
Pro Leu Asp Gln Arg His Leu Asn Leu Gly Lys Ala Leu Arg Ala Ile
85 90 95

ggc tac gat ccc gcg ctc att ggt tac acc acc acg aca cct gat ccg 336
Gly Tyr Asp Pro Ala Leu Ile Gly Tyr Thr Thr Thr Thr Pro Asp Pro
100 105 110

cgc aca acc tct gca agg gat ccg cgt ttc acg gtc ctg ggc gac atc 384
Arg Thr Thr Ser Ala Arg Asp Pro Arg Phe Thr Val Leu Gly Asp Ile
115 120 125

atg gac ggc ttt cgt tcg gtc ggc gca ttc gag ccc aat atg gag ggg 432
Met Asp Gly Phe Arg Ser Val Gly Ala Phe Glu Pro Asn Met Glu Gly
130 135 140

tat ttt ggc tgg gtg gcg cag aac ggc ttc gaa ctg cca gag aac cgc 480
Tyr Phe Gly Trp Val Ala Gln Asn Gly Phe Glu Leu Pro Glu Asn Arg
145 150 155 160

gaa gat atc tgg ctg ccg gaa ggt gaa cat tcc gtt ccc ggt gct acc 528
Glu Asp Ile Trp Leu Pro Glu Gly Glu His Ser Val Pro Gly Ala Thr
165 170 175

gac aaa ccg tcg cgc att ccg aag gaa ttt tcg gat tcg aca ttc ttc 576
Asp Lys Pro Ser Arg Ile Pro Lys Glu Phe Ser Asp Ser Thr Phe Phe
180 185 190

acg gag cgc gcc ctg aca tat ctg aag ggc agg gac ggc aag cct ttc 624
Thr Glu Arg Ala Leu Thr Tyr Leu Lys Gly Arg Asp Gly Lys Pro Phe
195 200 205

ttc ctg cat ctt ggc tat tat cgc ccg cat ccg cct ttc gta gcc tcc 672
Phe Leu His Leu Gly Tyr Tyr Arg Pro His Pro Pro Phe Val Ala Ser
210 215 220

gcg ccc tac cat gcg atg tac aaa gcc gaa gat atg cct gcg cct ata 720
Ala Pro Tyr His Ala Met Tyr Lys Ala Glu Asp Met Pro Ala Pro Ile
225 230 235 240

cgt gcg gag aat ccg gat gcc gaa gcg gca cag cat ccg ctc atg aag 768
Arg Ala Glu Asn Pro Asp Ala Glu Ala Ala Gln His Pro Leu Met Lys
245 250 255

cac tat atc gac cac atc aga cgc ggc tcg ttc ttc cat ggc gcg gaa 816
His Tyr Ile Asp His Ile Arg Arg Gly Ser Phe Phe His Gly Ala Glu
260 265 270

ggc tcg gga gca acg ctt gat gaa ggc gaa att cgc cag atg cgc gct 864
Gly Ser Gly Ala Thr Leu Asp Glu Gly Glu Ile Arg Gln Met Arg Ala
275 280 285

aca tat tgc gga ctg atc acc gag atc gac gat tgt ctg ggg agg gtc 912
Thr Tyr Cys Gly Leu Ile Thr Glu Ile Asp Asp Cys Leu Gly Arg Val
290 295 300

ttt gcc tat ctc gat gaa acc ggt cag tgg gac gac acg ctg att atc 960
Phe Ala Tyr Leu Asp Glu Thr Gly Gln Trp Asp Asp Thr Leu Ile Ile
305 310 315 320

ttc acg agc gat cat ggc gaa caa ctg ggc gat cat cac ctg ctc ggc 1008
Phe Thr Ser Asp His Gly Glu Gln Leu Gly Asp His His Leu Leu Gly
325 330 335

aag atc ggt tac aat gcc gaa agc ttc cgt att ccc ttg gtc ata aag 1056
Lys Ile Gly Tyr Asn Ala Glu Ser Phe Arg Ile Pro Leu Val Ile Lys
340 345 350

gat gcg gga cag aac cgg cac gcc ggc cag atc gaa gaa ggc ttc tcc 1104
Asp Ala Gly Gln Asn Arg His Ala Gly Gln Ile Glu Glu Gly Phe Ser
355 360 365

gaa agc atc gac gtc atg ccg acc atc ctc gaa tgg ctg ggc ggg gaa 1152
Glu Ser Ile Asp Val Met Pro Thr Ile Leu Glu Trp Leu Gly Gly Glu
370 375 380

acg cct cgc gcc tgc gac ggc cgt tcg ctg ttg ccg ttt ctg gct gag 1200
Thr Pro Arg Ala Cys Asp Gly Arg Ser Leu Leu Pro Phe Leu Ala Glu
385 390 395 400

gga aag ccc tcc gac tgg cgc acg gaa cta cat tac gag ttc gat ttt 1248
Gly Lys Pro Ser Asp Trp Arg Thr Glu Leu His Tyr Glu Phe Asp Phe
405 410 415

cgc gat gtc ttc tac gat cag ccg cag aac tcg gtc cag ctt tcc cag 1296
Arg Asp Val Phe Tyr Asp Gln Pro Gln Asn Ser Val Gln Leu Ser Gln
420 425 430

gat gat tgc agc ctc tgt gtg atc gag gac gaa aac tac aag tac gtg 1344
Asp Asp Cys Ser Leu Cys Val Ile Glu Asp Glu Asn Tyr Lys Tyr Val
435 440 445

cat ttt gcc gcc ctg ccg ccg ctg ttc ttc gat ctg aag gca gac ccg 1392
His Phe Ala Ala Leu Pro Pro Leu Phe Phe Asp Leu Lys Ala Asp Pro
450 455 460

cat gaa ttc agc aat ctg gct ggc gat cct gct tat gcg gcc ctc gtt 1440
His Glu Phe Ser Asn Leu Ala Gly Asp Pro Ala Tyr Ala Ala Leu Val
465 470 475 480

cgt gac tat gcc cag aag gca ttg tcg tgg cga ctg tct cat gcc gac 1488
Arg Asp Tyr Ala Gln Lys Ala Leu Ser Trp Arg Leu Ser His Ala Asp
485 490 495

cgg aca ctc acc cat tac aga tcc agc ccg caa ggg ctg aca acg cgc 1536
Arg Thr Leu Thr His Tyr Arg Ser Ser Pro Gln Gly Leu Thr Thr Arg
500 505 510

aac cat tga 1545
Asn His

<210> 26 
<211> 514 
<212> PRT 

<213> Burkholderia caryophylli 
<400> 26 

Met Thr Arg Lys Asn Val Leu Leu Ile Val Val Asp Gln Trp Arg Ala
1 5 10 15

Asp Phe Ile Pro His Leu Met Arg Ala Glu Gly Arg Glu Pro Phe Leu
20 25 30

Lys Thr Pro Asn Leu Asp Arg Leu Cys Arg Glu Gly Leu Thr Phe Arg
35 40 45

Asn His Val Thr Thr Cys Val Pro Cys Gly Pro Ala Arg Ala Ser Leu
50 55 60

Leu Thr Gly Leu Tyr Leu Met Asn His Arg Ala Val Gln Asn Thr Val
65 70 75 80

Pro Leu Asp Gln Arg His Leu Asn Leu Gly Lys Ala Leu Arg Ala Ile
85 90 95

Gly Tyr Asp Pro Ala Leu Ile Gly Tyr Thr Thr Thr Thr Pro Asp Pro
100 105 110

Arg Thr Thr Ser Ala Arg Asp Pro Arg Phe Thr Val Leu Gly Asp Ile
115 120 125

Met Asp Gly Phe Arg Ser Val Gly Ala Phe Glu Pro Asn Met Glu Gly
130 135 140

Tyr Phe Gly Trp Val Ala Gln Asn Gly Phe Glu Leu Pro Glu Asn Arg
145 150 155 160

Glu Asp Ile Trp Leu Pro Glu Gly Glu His Ser Val Pro Gly Ala Thr
165 170 175

Asp Lys Pro Ser Arg Ile Pro Lys Glu Phe Ser Asp Ser Thr Phe Phe
180 185 190

Thr Glu Arg Ala Leu Thr Tyr Leu Lys Gly Arg Asp Gly Lys Pro Phe
195 200 205

Phe Leu His Leu Gly Tyr Tyr Arg Pro His Pro Pro Phe Val Ala Ser
210 215 220

Ala Pro Tyr His Ala Met Tyr Lys Ala Glu Asp Met Pro Ala Pro Ile
225 230 235 240

Arg Ala Glu Asn Pro Asp Ala Glu Ala Ala Gln His Pro Leu Met Lys
245 250 255

His Tyr Ile Asp His Ile Arg Arg Gly Ser Phe Phe His Gly Ala Glu
260 265 270

Gly Ser Gly Ala Thr Leu Asp Glu Gly Glu Ile Arg Gln Met Arg Ala
275 280 285

Thr Tyr Cys Gly Leu Ile Thr Glu Ile Asp Asp Cys Leu Gly Arg Val
290 295 300

Phe Ala Tyr Leu Asp Glu Thr Gly Gln Trp Asp Asp Thr Leu Ile Ile
305 310 315 320

Phe Thr Ser Asp His Gly Glu Gln Leu Gly Asp His His Leu Leu Gly
325 330 335

Lys Ile Gly Tyr Asn Ala Glu Ser Phe Arg Ile Pro Leu Val Ile Lys
340 345 350

Asp Ala Gly Gln Asn Arg His Ala Gly Gln Ile Glu Glu Gly Phe Ser
355 360 365

Glu Ser Ile Asp Val Met Pro Thr Ile Leu Glu Trp Leu Gly Gly Glu
370 375 380

Thr Pro Arg Ala Cys Asp Gly Arg Ser Leu Leu Pro Phe Leu Ala Glu
385 390 395 400

Gly Lys Pro Ser Asp Trp Arg Thr Glu Leu His Tyr Glu Phe Asp Phe
405 410 415

Arg Asp Val Phe Tyr Asp Gln Pro Gln Asn Ser Val Gln Leu Ser Gln
420 425 430

Asp Asp Cys Ser Leu Cys Val Ile Glu Asp Glu Asn Tyr Lys Tyr Val
435 440 445

His Phe Ala Ala Leu Pro Pro Leu Phe Phe Asp Leu Lys Ala Asp Pro
450 455 460

His Glu Phe Ser Asn Leu Ala Gly Asp Pro Ala Tyr Ala Ala Leu Val
465 470 475 480

Arg Asp Tyr Ala Gln Lys Ala Leu Ser Trp Arg Leu Ser His Ala Asp
485 490 495

Arg Thr Leu Thr His Tyr Arg Ser Ser Pro Gln Gly Leu Thr Thr Arg
500 505 510 

Asn His

<210> 27
<211> 2250
<212> DNA

<213> Agrobacterium rhizogenes

<220>
<221> CDS
<222> (1)..(2247)

<223> coding for tryptophan oxygenase (aux1)
<400> 27

atg gct gga tcc tcc ttc aca ttg cca tca act ggc tca gcg ccc ctt 48
Met Ala Gly Ser Ser Phe Thr Leu Pro Ser Thr Gly Ser Ala Pro Leu
1 5 10 15

gat atg atg ctt atc gat gat tca gat ctg ctg caa ttg ggt ctc cag 96
Asp Met Met Leu Ile Asp Asp Ser Asp Leu Leu Gln Leu Gly Leu Gln
20 25 30

cag gta ttc tcg aag cgg tac aca gag aca ccg cag tca cgc tac aaa 144
Gln Val Phe Ser Lys Arg Tyr Thr Glu Thr Pro Gln Ser Arg Tyr Lys
35 40 45

ctg acc agg agg gct tct cca gac gtc tca tct ggc gaa ggc aat gtg 192
Leu Thr Arg Arg Ala Ser Pro Asp Val Ser Ser Gly Glu Gly Asn Val
50 55 60

cat gcc ctt gcg ttc ata tat gtc aac gct gag acg ttg cag atg atc 240
His Ala Leu Ala Phe Ile Tyr Val Asn Ala Glu Thr Leu Gln Met Ile
65 70 75 80

aaa aac gct cga tcg cta acc gaa gcg aac ggc gtc aaa gat ctt gtc 288
Lys Asn Ala Arg Ser Leu Thr Glu Ala Asn Gly Val Lys Asp Leu Val
85 90 95

gcc atc gac gtt ccg cca ttt cga aac gac ttc tca aga gcg cta ctc 336
Ala Ile Asp Val Pro Pro Phe Arg Asn Asp Phe Ser Arg Ala Leu Leu
100 105 110

ctt caa gtg atc aac ttg ttg gga aac aac cga aat gcc gat gac gat 384
Leu Gln Val Ile Asn Leu Leu Gly Asn Asn Arg Asn Ala Asp Asp Asp
115 120 125

ctt agt cac ttc ata gca gtt gct ctc cca aac agc gcc cgc tct aag 432
Leu Ser His Phe Ile Ala Val Ala Leu Pro Asn Ser Ala Arg Ser Lys
130 135 140

atc cta acc acg gca ccg ttc gaa gga agc ttg tca gaa aac ttc agg 480
Ile Leu Thr Thr Ala Pro Phe Glu Gly Ser Leu Ser Glu Asn Phe Arg
145 150 155 160

ggg ttc ccg atc act cgt gaa gga aat gtg gca tgt gaa gtg cta gcc 528
Gly Phe Pro Ile Thr Arg Glu Gly Asn Val Ala Cys Glu Val Leu Ala
165 170 175

tat ggg aat aac ttg atg ccc aag gcc tgc tcc gat tcc ttt cca acc 576
Tyr Gly Asn Asn Leu Met Pro Lys Ala Cys Ser Asp Ser Phe Pro Thr
180 185 190

gtg gat ctt ctt tat gac tat ggc aag ttc ttc gag agt tgc gcg gcc 624
Val Asp Leu Leu Tyr Asp Tyr Gly Lys Phe Phe Glu Ser Cys Ala Ala
195 200 205

gat gga cgt atc ggt tat ttt cct gaa ggc gtt acg aaa cct aaa gtg 672
Asp Gly Arg Ile Gly Tyr Phe Pro Glu Gly Val Thr Lys Pro Lys Val
210 215 220

gct ata att ggc gca ggc ttt tcc ggg ctc gtt gca gcg agc gaa cta 720
Ala Ile Ile Gly Ala Gly Phe Ser Gly Leu Val Ala Ala Ser Glu Leu
225 230 235 240

ctt cat gca ggg gta gac gat gtt acg gtg tat gag gcg agt gat cgg 768
Leu His Ala Gly Val Asp Asp Val Thr Val Tyr Glu Ala Ser Asp Arg
245 250 255

ctt gga gga aag cta tgg tca cac gga ttt aag agt gct cca aat gtg 816
Leu Gly Gly Lys Leu Trp Ser His Gly Phe Lys Ser Ala Pro Asn Val
260 265 270

ata gcc gag atg ggg gcc atg cgt ttt ccg cga agt gaa tca tgc ttg 864
Ile Ala Glu Met Gly Ala Met Arg Phe Pro Arg Ser Glu Ser Cys Leu
275 280 285

ttc ttc tat ctc aaa aag cac gga ctg gac tcc gtt ggt ctg ttc ccg 912
Phe Phe Tyr Leu Lys Lys His Gly Leu Asp Ser Val Gly Leu Phe Pro
290 295 300

aat ccg gga agt gtc gat acc gca ttg ttc tac agg ggc cgt caa tat 960
Asn Pro Gly Ser Val Asp Thr Ala Leu Phe Tyr Arg Gly Arg Gln Tyr
305 310 315 320

atc tgg aaa gcg gga gag gag cca ccg gag ctg ttt cgt cgt gtg cac 1008
Ile Trp Lys Ala Gly Glu Glu Pro Pro Glu Leu Phe Arg Arg Val His
325 330 335

cat gga tgg cgc gca ttt ttg caa gat ggc tat ctc cat gat gga gtc 1056
His Gly Trp Arg Ala Phe Leu Gln Asp Gly Tyr Leu His Asp Gly Val
340 345 350

atg ttg gcg tca ccg tta gca att gtt gac gcc ttg aat tta ggg cat 1104
Met Leu Ala Ser Pro Leu Ala Ile Val Asp Ala Leu Asn Leu Gly His
355 360 365

cta cag cag gcg cat ggc ttc tgg caa tct tgg ctc aca tat ttt gag 1152
Leu Gln Gln Ala His Gly Phe Trp Gln Ser Trp Leu Thr Tyr Phe Glu
370 375 380

cga gag tct ttc tct tct ggc atc gaa aaa atg ttc ttg ggc aat cat 1200
Arg Glu Ser Phe Ser Ser Gly Ile Glu Lys Met Phe Leu Gly Asn His
385 390 395 400

cct ccg ggg ggt gaa caa tgg aat tcc cta gat gac ttg gat ctt ttc 1248
Pro Pro Gly Gly Glu Gln Trp Asn Ser Leu Asp Asp Leu Asp Leu Phe
405 410 415

aaa gcg ctg ggt att gga tcc ggc gga ttc ggc cct gta ttt gaa agt 1296
Lys Ala Leu Gly Ile Gly Ser Gly Gly Phe Gly Pro Val Phe Glu Ser
420 425 430

ggg ttt atc gag atc ctt cgc tta gtc gtc aac ggg tat gag gat aac 1344
Gly Phe Ile Glu Ile Leu Arg Leu Val Val Asn Gly Tyr Glu Asp Asn
435 440 445

gtg cgg ctg agt tac gaa gga att tct gag ctg cct cat agg atc gcc 1392
Val Arg Leu Ser Tyr Glu Gly Ile Ser Glu Leu Pro His Arg Ile Ala
450 455 460

tca cag gta att aac ggc aga tct att cgc gag cgt aca att cac gtt 1440
Ser Gln Val Ile Asn Gly Arg Ser Ile Arg Glu Arg Thr Ile His Val
465 470 475 480

caa gtc gag cag att gat aga gag gag gat aaa ata aat atc aag atc 1488
Gln Val Glu Gln Ile Asp Arg Glu Glu Asp Lys Ile Asn Ile Lys Ile
485 490 495

aaa gga gga aag gtt gag gtc tat gat cga gta ctg gtt aca tcc ggg 1536
Lys Gly Gly Lys Val Glu Val Tyr Asp Arg Val Leu Val Thr Ser Gly
500 505 510

ttt gcg aac atc gaa atg cgc cat ctc ctg aca tca agc aac gca ttc 1584
Phe Ala Asn Ile Glu Met Arg His Leu Leu Thr Ser Ser Asn Ala Phe
515 520 525

ttc cat gca gat gta agc cat gca ata ggg aac agt cat atg act ggt 1632
Phe His Ala Asp Val Ser His Ala Ile Gly Asn Ser His Met Thr Gly
530 535 540

gcg tca aaa ctg ttc ttg ctg act aac gaa aaa ttc tgg cta caa cat 1680
Ala Ser Lys Leu Phe Leu Leu Thr Asn Glu Lys Phe Trp Leu Gln His
545 550 555 560

cat ttg cca tcg tgc ata ctc acc acc ggc gtt gca aag gca gtt tat 1728
His Leu Pro Ser Cys Ile Leu Thr Thr Gly Val Ala Lys Ala Val Tyr
565 570 575

tgc tta gac tat gat ccg cga gat cca agc ggc aaa gga ctg gtg ttg 1776
Cys Leu Asp Tyr Asp Pro Arg Asp Pro Ser Gly Lys Gly Leu Val Leu
580 585 590

ata agc tat act tgg gag gat gac tca cat aag ctc cta gcc gtc ccc 1824
Ile Ser Tyr Thr Trp Glu Asp Asp Ser His Lys Leu Leu Ala Val Pro
595 600 605

gac aaa aga gaa agg ttc gca tcg ctg cag cgc gat att ggg agg gca 1872
Asp Lys Arg Glu Arg Phe Ala Ser Leu Gln Arg Asp Ile Gly Arg Ala
610 615 620

ttc cca gat ttt gcc aag cac cta act cct gca gac ggg aac tat gat 1920
Phe Pro Asp Phe Ala Lys His Leu Thr Pro Ala Asp Gly Asn Tyr Asp
625 630 635 640

gat aat atc gtt caa cat gat tgg ctg act gat ccc cac gct ggc gga 1968
Asp Asn Ile Val Gln His Asp Trp Leu Thr Asp Pro His Ala Gly Gly
645 650 655

gcg ttt aaa ctg aac cgc aga ggc aac gac gta tat tca gaa agg ctt 2016
Ala Phe Lys Leu Asn Arg Arg Gly Asn Asp Val Tyr Ser Glu Arg Leu
660 665 670

ttc ttt cag ccc ttt gac gta atg cat ccc gcg gac gat aag gga ctt 2064
Phe Phe Gln Pro Phe Asp Val Met His Pro Ala Asp Asp Lys Gly Leu
675 680 685

tac ttg gcc ggt tgt agc tgt tcc ttc acc gga ggg tgg gtt cat ggt 2112
Tyr Leu Ala Gly Cys Ser Cys Ser Phe Thr Gly Gly Trp Val His Gly
690 695 700

gcc att cag acc gca tgc aac gct acg tgt gcg atc att tat ggt tcc 2160
Ala Ile Gln Thr Ala Cys Asn Ala Thr Cys Ala Ile Ile Tyr Gly Ser
705 710 715 720

gga cac ctg caa gag cta atc cac tgg cga cac ctc aaa gaa ggt aat 2208
Gly His Leu Gln Glu Leu Ile His Trp Arg His Leu Lys Glu Gly Asn
725 730 735

cca ctg gcg cac gct tgg aag cgg tat agg tat caa gcg tga 2250
Pro Leu Ala His Ala Trp Lys Arg Tyr Arg Tyr Gln Ala
740 745

<210> 28
<211> 749
<212> PRT

<213> Agrobacterium rhizogenes
<400> 28

Met Ala Gly Ser Ser Phe Thr Leu Pro Ser Thr Gly Ser Ala Pro Leu
1 5 10 15

Asp Met Met Leu Ile Asp Asp Ser Asp Leu Leu Gln Leu Gly Leu Gln
20 25 30

Gln Val Phe Ser Lys Arg Tyr Thr Glu Thr Pro Gln Ser Arg Tyr Lys
35 40 45

Leu Thr Arg Arg Ala Ser Pro Asp Val Ser Ser Gly Glu Gly Asn Val
50 55 60

His Ala Leu Ala Phe Ile Tyr Val Asn Ala Glu Thr Leu Gln Met Ile
65 70 75 80

Lys Asn Ala Arg Ser Leu Thr Glu Ala Asn Gly Val Lys Asp Leu Val
85 90 95

Ala Ile Asp Val Pro Pro Phe Arg Asn Asp Phe Ser Arg Ala Leu Leu
100 105 110

Leu Gln Val Ile Asn Leu Leu Gly Asn Asn Arg Asn Ala Asp Asp Asp
115 120 125

Leu Ser His Phe Ile Ala Val Ala Leu Pro Asn Ser Ala Arg Ser Lys
130 135 140

Ile Leu Thr Thr Ala Pro Phe Glu Gly Ser Leu Ser Glu Asn Phe Arg
145 150 155 160

Gly Phe Pro Ile Thr Arg Glu Gly Asn Val Ala Cys Glu Val Leu Ala
165 170 175

Tyr Gly Asn Asn Leu Met Pro Lys Ala Cys Ser Asp Ser Phe Pro Thr
180 185 190

Val Asp Leu Leu Tyr Asp Tyr Gly Lys Phe Phe Glu Ser Cys Ala Ala
195 200 205

Asp Gly Arg Ile Gly Tyr Phe Pro Glu Gly Val Thr Lys Pro Lys Val
210 215 220

Ala Ile Ile Gly Ala Gly Phe Ser Gly Leu Val Ala Ala Ser Glu Leu
225 230 235 240

Leu His Ala Gly Val Asp Asp Val Thr Val Tyr Glu Ala Ser Asp Arg
245 250 255

Leu Gly Gly Lys Leu Trp Ser His Gly Phe Lys Ser Ala Pro Asn Val
260 265 270

Ile Ala Glu Met Gly Ala Met Arg Phe Pro Arg Ser Glu Ser Cys Leu
275 280 285

Phe Phe Tyr Leu Lys Lys His Gly Leu Asp Ser Val Gly Leu Phe Pro
290 295 300

Asn Pro Gly Ser Val Asp Thr Ala Leu Phe Tyr Arg Gly Arg Gln Tyr
305 310 315 320

Ile Trp Lys Ala Gly Glu Glu Pro Pro Glu Leu Phe Arg Arg Val His
325 330 335

His Gly Trp Arg Ala Phe Leu Gln Asp Gly Tyr Leu His Asp Gly Val
340 345 350

Met Leu Ala Ser Pro Leu Ala Ile Val Asp Ala Leu Asn Leu Gly His
355 360 365

Leu Gln Gln Ala His Gly Phe Trp Gln Ser Trp Leu Thr Tyr Phe Glu
370 375 380

Arg Glu Ser Phe Ser Ser Gly Ile Glu Lys Met Phe Leu Gly Asn His
385 390 395 400

Pro Pro Gly Gly Glu Gln Trp Asn Ser Leu Asp Asp Leu Asp Leu Phe
405 410 415

Lys Ala Leu Gly Ile Gly Ser Gly Gly Phe Gly Pro Val Phe Glu Ser
420 425 430

Gly Phe Ile Glu Ile Leu Arg Leu Val Val Asn Gly Tyr Glu Asp Asn
435 440 445

Val Arg Leu Ser Tyr Glu Gly Ile Ser Glu Leu Pro His Arg Ile Ala
450 455 460

Ser Gln Val Ile Asn Gly Arg Ser Ile Arg Glu Arg Thr Ile His Val
465 470 475 480

Gln Val Glu Gln Ile Asp Arg Glu Glu Asp Lys Ile Asn Ile Lys Ile
485 490 495

Lys Gly Gly Lys Val Glu Val Tyr Asp Arg Val Leu Val Thr Ser Gly
500 505 510

Phe Ala Asn Ile Glu Met Arg His Leu Leu Thr Ser Ser Asn Ala Phe
515 520 525

Phe His Ala Asp Val Ser His Ala Ile Gly Asn Ser His Met Thr Gly
530 535 540

Ala Ser Lys Leu Phe Leu Leu Thr Asn Glu Lys Phe Trp Leu Gln His
545 550 555 560

His Leu Pro Ser Cys Ile Leu Thr Thr Gly Val Ala Lys Ala Val Tyr
565 570 575

Cys Leu Asp Tyr Asp Pro Arg Asp Pro Ser Gly Lys Gly Leu Val Leu
580 585 590

Ile Ser Tyr Thr Trp Glu Asp Asp Ser His Lys Leu Leu Ala Val Pro
595 600 605

Asp Lys Arg Glu Arg Phe Ala Ser Leu Gln Arg Asp Ile Gly Arg Ala
610 615 620

Phe Pro Asp Phe Ala Lys His Leu Thr Pro Ala Asp Gly Asn Tyr Asp
625 630 635 640

Asp Asn Ile Val Gln His Asp Trp Leu Thr Asp Pro His Ala Gly Gly
645 650 655

Ala Phe Lys Leu Asn Arg Arg Gly Asn Asp Val Tyr Ser Glu Arg Leu
660 665 670

Phe Phe Gln Pro Phe Asp Val Met His Pro Ala Asp Asp Lys Gly Leu
675 680 685

Tyr Leu Ala Gly Cys Ser Cys Ser Phe Thr Gly Gly Trp Val His Gly
690 695 700

Ala Ile Gln Thr Ala Cys Asn Ala Thr Cys Ala Ile Ile Tyr Gly Ser
705 710 715 720

Gly His Leu Gln Glu Leu Ile His Trp Arg His Leu Lys Glu Gly Asn
725 730 735

Pro Leu Ala His Ala Trp Lys Arg Tyr Arg Tyr Gln Ala
740 745 



CA 02493364 2005-01-21 



PF 53790 






<210> 29 
<211> 1401 
<212> DNA

<213> Agrobacterium rhizogenes

<220>

<221> CDS

<222> (1)..(1398)

<223> coding for indoleacetamide hydrolase
<400> 29

atg gtg acc ctc tcc tcg atc acc gag acg ctt aaa tgt ctc agg gaa 48
Met Val Thr Leu Ser Ser Ile Thr Glu Thr Leu Lys Cys Leu Arg Glu
1 5 10 15

aga aaa tac tcg tgc ttt gag tta atc gaa acg ata ata gcc cgc tgt 96
Arg Lys Tyr Ser Cys Phe Glu Leu Ile Glu Thr Ile Ile Ala Arg Cys
20 25 30

gaa gca gca aga tcc tta aac gcc ttt ctg gaa acc gac tgg gcg cac 144
Glu Ala Ala Arg Ser Leu Asn Ala Phe Leu Glu Thr Asp Trp Ala His
35 40 45

cta cgg tgg act gcc agc aaa atc gat caa cac gga ggt gcc ggt gtt 192
Leu Arg Trp Thr Ala Ser Lys Ile Asp Gln His Gly Gly Ala Gly Val
50 55 60

ggc cta gct ggc gtt ccc cta tgc ttt aaa gcg aat att gcg aca ggc 240
Gly Leu Ala Gly Val Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

agg ttc gcc gcg acc gct ggt acg cca ggc tta cag aac cac aaa ccc 288
Arg Phe Ala Ala Thr Ala Gly Thr Pro Gly Leu Gln Asn His Lys Pro
85 90 95

aag acg cct gcc gga gtt gca cga caa ctt ctc gcg gct ggg gca ctg 336
Lys Thr Pro Ala Gly Val Ala Arg Gln Leu Leu Ala Ala Gly Ala Leu
100 105 110

cct ggc gct tcg gga aac atg cac gaa ttg tct ttt ggg atc acg agc 384
Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac ttc gcc aca ggc gcc gta cga aac ccg tgg aac cct agt ctc 432
Asn Asn Phe Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

atc cca ggg gga tca agt ggg ggt gtg gcc gcc gcg gtg gcc ggc cga 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Gly Arg
145 150 155 160

ttg atg ctg ggc ggc gtc gga act gac acg gga gcg tcg gtc cgt tta 528
Leu Met Leu Gly Gly Val Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

ccg gcc gcc ttg tgc ggc gtg gtg ggg ttt cgt cct acc gtg ggg cga 576
Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Val Gly Arg
180 185 190

tat cca acg gac gga ata gtt ccg gta agc ccc acc cgg gac acc cct 624
Tyr Pro Thr Asp Gly Ile Val Pro Val Ser Pro Thr Arg Asp Thr Pro
195 200 205 

ggc gtt atc gca cag aat gtt ccg gac gtg att ctt ctt gac ggt atc 672
Gly Val Ile Ala Gln Asn Val Pro Asp Val Ile Leu Leu Asp Gly Ile
210 215 220

att tgc ggg aga ccg ccg gtt aat caa acg gtc cgc ctg aag ggg ctg 720
Ile Cys Gly Arg Pro Pro Val Asn Gln Thr Val Arg Leu Lys Gly Leu
225 230 235 240

cgt ata ggc ttg cca acc gct tac ttt tac aac gac ctg gag ccc gat 768
Arg Ile Gly Leu Pro Thr Ala Tyr Phe Tyr Asn Asp Leu Glu Pro Asp
245 250 255

gtc gcc tta gca gcc gag acg att atc aga gtt ctg gca cgc aaa gat 816
Val Ala Leu Ala Ala Glu Thr Ile Ile Arg Val Leu Ala Arg Lys Asp
260 265 270

gtt act ttt gtt gaa gca gat att cct gat tta gcg cat cac aat gaa 864
Val Thr Phe Val Glu Ala Asp Ile Pro Asp Leu Ala His His Asn Glu
275 280 285

ggg gtc agc ttt ccg act gcc atc tac gaa ttt ccg ttg tcc ctt gaa 912
Gly Val Ser Phe Pro Thr Ala Ile Tyr Glu Phe Pro Leu Ser Leu Glu
290 295 300

cat tat att cag aac ttc gta gag ggt gtt tcc ttt tct gag gtt gtc 960
His Tyr Ile Gln Asn Phe Val Glu Gly Val Ser Phe Ser Glu Val Val
305 310 315 320

aga gcg att cgc agt ccg gat gtt gca agt att ctc aat gca caa ctc 1008
Arg Ala Ile Arg Ser Pro Asp Val Ala Ser Ile Leu Asn Ala Gln Leu
325 330 335

tcg gat aat ctt att tcc aaa agc gag tat tgt ctg gcg cga cgt ttt 1056
Ser Asp Asn Leu Ile Ser Lys Ser Glu Tyr Cys Leu Ala Arg Arg Phe
340 345 350 

ttc aga ccg aga ctc caa gcg gcc tac cac agt tac ttc aag gcg cat 1104
Phe Arg Pro Arg Leu Gln Ala Ala Tyr His Ser Tyr Phe Lys Ala His
355 360 365

cag cta gat gca att ctt ttc cca aca gct ccg ttg aca gcc aag cca 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Thr Ala Lys Pro
370 375 380

att ggc cat gat cta tcg gtg att cac aat ggc tca atg acc gat acc 1200
Ile Gly His Asp Leu Ser Val Ile His Asn Gly Ser Met Thr Asp Thr
385 390 395 400

ttt aaa atc ttc gtg cgg aat gta gat ccc agc agt aat gcg ggc ctg 1248
Phe Lys Ile Phe Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

ccg ggc cta agt ctt ccc gtt tct ctt agt tcc aac ggt ctg cct att 1296
Pro Gly Leu Ser Leu Pro Val Ser Leu Ser Ser Asn Gly Leu Pro Ile
420 425 430

ggc atg gaa atc gat ggc tct gca agc tcg gat gaa cgt ctg tta gca 1344
Gly Met Glu Ile Asp Gly Ser Ala Ser Ser Asp Glu Arg Leu Leu Ala
435 440 445

att gga cta gcg ata gaa gaa gca ata gac ttt agg cat cgt ccg act 1392
Ile Gly Leu Ala Ile Glu Glu Ala Ile Asp Phe Arg His Arg Pro Thr
450 455 460

ctg tcg taa 1401
Leu Ser
465

<210> 30
<211> 466
<212> PRT

<213> Agrobacterium rhizogenes
<400> 30

Met Val Thr Leu Ser Ser Ile Thr Glu Thr Leu Lys Cys Leu Arg Glu
1 5 10 15

Arg Lys Tyr Ser Cys Phe Glu Leu Ile Glu Thr Ile Ile Ala Arg Cys
20 25 30

Glu Ala Ala Arg Ser Leu Asn Ala Phe Leu Glu Thr Asp Trp Ala His
35 40 45

Leu Arg Trp Thr Ala Ser Lys Ile Asp Gln His Gly Gly Ala Gly Val
50 55 60

Gly Leu Ala Gly Val Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Arg Phe Ala Ala Thr Ala Gly Thr Pro Gly Leu Gln Asn His Lys Pro
85 90 95

Lys Thr Pro Ala Gly Val Ala Arg Gln Leu Leu Ala Ala Gly Ala Leu
100 105 110

Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn Phe Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Gly Arg
145 150 155 160

Leu Met Leu Gly Gly Val Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Val Gly Arg
180 185 190

Tyr Pro Thr Asp Gly Ile Val Pro Val Ser Pro Thr Arg Asp Thr Pro
195 200 205

Gly Val Ile Ala Gln Asn Val Pro Asp Val Ile Leu Leu Asp Gly Ile
210 215 220

Ile Cys Gly Arg Pro Pro Val Asn Gln Thr Val Arg Leu Lys Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Ala Tyr Phe Tyr Asn Asp Leu Glu Pro Asp
245 250 255

Val Ala Leu Ala Ala Glu Thr Ile Ile Arg Val Leu Ala Arg Lys Asp
260 265 270

Val Thr Phe Val Glu Ala Asp Ile Pro Asp Leu Ala His His Asn Glu
275 280 285

Gly Val Ser Phe Pro Thr Ala Ile Tyr Glu Phe Pro Leu Ser Leu Glu
290 295 300

His Tyr Ile Gln Asn Phe Val Glu Gly Val Ser Phe Ser Glu Val Val
305 310 315 320

Arg Ala Ile Arg Ser Pro Asp Val Ala Ser Ile Leu Asn Ala Gln Leu
325 330 335






Ser Asp Asn Leu Ile Ser Lys Ser Glu Tyr Cys Leu Ala Arg Arg Phe
340 345 350

Phe Arg Pro Arg Leu Gln Ala Ala Tyr His Ser Tyr Phe Lys Ala His
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Thr Ala Lys Pro
370 375 380

Ile Gly His Asp Leu Ser Val Ile His Asn Gly Ser Met Thr Asp Thr
385 390 395 400

Phe Lys Ile Phe Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Val Ser Leu Ser Ser Asn Gly Leu Pro Ile
420 425 430

Gly Met Glu Ile Asp Gly Ser Ala Ser Ser Asp Glu Arg Leu Leu Ala
435 440 445

Ile Gly Leu Ala Ile Glu Glu Ala Ile Asp Phe Arg His Arg Pro Thr
450 455 460

Leu Ser
465



<210> 31 
<211> 2268
<212> DNA 

<213> Agrobacterium tumefaciens

<220> 

<221> CDS 

<222> (1)..(2265) 

<223> coding for tryptophan monooxygenase 
<400> 31 

atg tca gct tca cct ctc ctt gat aac cag tgc gat cat ttc tct acc 48
Met Ser Ala Ser Pro Leu Leu Asp Asn Gln Cys Asp His Phe Ser Thr
1 5 10 15

aaa atg gtg gat ctg ata atg gtc gat aag gct gat gaa ttg gac cgc 96
Lys Met Val Asp Leu Ile Met Val Asp Lys Ala Asp Glu Leu Asp Arg
20 25 30

agg gtt tcc gat gcc ttc tca gaa cgt gaa gct tct agg gga agg agg 144
Arg Val Ser Asp Ala Phe Ser Glu Arg Glu Ala Ser Arg Gly Arg Arg
35 40 45

att act caa atc tcc ggc gag tgc agc gct ggg tta gct tgc aaa agg 192
Ile Thr Gln Ile Ser Gly Glu Cys Ser Ala Gly Leu Ala Cys Lys Arg
50 55 60

ctg gcc gac ggt cgc ttt ccc gag atc tca act ggt gag aag gta gca 240
Leu Ala Asp Gly Arg Phe Pro Glu Ile Ser Thr Gly Glu Lys Val Ala
65 70 75 80

gcc ctc tcc gct tac atc tat gtt ggc aag gaa att ctg ggg cgg ata 288
Ala Leu Ser Ala Tyr Ile Tyr Val Gly Lys Glu Ile Leu Gly Arg Ile
85 90 95

ctt gaa tcg gaa cct tgg gcg cga gca aga gtg agt ggt ctc gtt gcc 336
Leu Glu Ser Glu Pro Trp Ala Arg Ala Arg Val Ser Gly Leu Val Ala
100 105 110

atc gac ctt gca cca ttt tgt atg gat ttc tcc gaa gca caa ctt ctc 384
Ile Asp Leu Ala Pro Phe Cys Met Asp Phe Ser Glu Ala Gln Leu Leu
115 120 125

caa acc ctg ttt ttg ctg agc ggt aaa aga tgt gca tcc agc gat ctt 432
Gln Thr Leu Phe Leu Leu Ser Gly Lys Arg Cys Ala Ser Ser Asp Leu
130 135 140

agt cat ttc gtg gcc att tca atc tct aag act gcc cgc tcc cga acc 480
Ser His Phe Val Ala Ile Ser Ile Ser Lys Thr Ala Arg Ser Arg Thr
145 150 155 160

ctg caa atg ccg ccg tac gag aaa ggc acg acg aaa cgc gtt acc ggg 528
Leu Gln Met Pro Pro Tyr Glu Lys Gly Thr Thr Lys Arg Val Thr Gly
165 170 175

ttt acc ctg acc ctt gaa gag gcc gta cca ttt gac atg gta gct tat 576
Phe Thr Leu Thr Leu Glu Glu Ala Val Pro Phe Asp Met Val Ala Tyr
180 185 190

ggt cga aac ctg atg ctg aag gct tcg gca ggt tcc ttt cca aca att 624
Gly Arg Asn Leu Met Leu Lys Ala Ser Ala Gly Ser Phe Pro Thr Ile
195 200 205

gac ttg ctc tat gac tac aga tcg ttt ttt gac caa tgt tcc gat att 672
Asp Leu Leu Tyr Asp Tyr Arg Ser Phe Phe Asp Gln Cys Ser Asp Ile
210 215 220

gga cgg atc ggc ttc ttt ccg gaa gat gtt cct aag ccg aaa gtg gcg 720
Gly Arg Ile Gly Phe Phe Pro Glu Asp Val Pro Lys Pro Lys Val Ala
225 230 235 240

atc att ggc gct ggc att tcc gga ctc gtg gta gca agc gaa ctg ctt 768
Ile Ile Gly Ala Gly Ile Ser Gly Leu Val Val Ala Ser Glu Leu Leu
245 250 255

cat gct ggt gta gac gat gtt aca ata tat gaa gca agt gat cgg gtt 816
His Ala Gly Val Asp Asp Val Thr Ile Tyr Glu Ala Ser Asp Arg Val
260 265 270 

gga ggc aag ctt tgg tca cat gct ttc aag gat gct ccc agc gtg gtg 864
Gly Gly Lys Leu Trp Ser His Ala Phe Lys Asp Ala Pro Ser Val Val
275 280 285

gcc gaa atg ggg gcg atg cga ttt cct cct gct gca tcg tgc ttg ttt 912
Ala Glu Met Gly Ala Met Arg Phe Pro Pro Ala Ala Ser Cys Leu Phe
290 295 300

ttc ttc ctc gag cgg tac ggc ctg tct tcg atg agg ccg ttc cca aat 960
Phe Phe Leu Glu Arg Tyr Gly Leu Ser Ser Met Arg Pro Phe Pro Asn
305 310 315 320

ccc ggc aca gtc gac act aac ttg gtc tac caa ggc ctc cga tac gtg 1008
Pro Gly Thr Val Asp Thr Asn Leu Val Tyr Gln Gly Leu Arg Tyr Val
325 330 335

tgg aaa gcc ggg cag cag cca ccg aag ctg ttc cat cgc gtt tac agc 1056
Trp Lys Ala Gly Gln Gln Pro Pro Lys Leu Phe His Arg Val Tyr Ser
340 345 350

ggt tgg cgt gcg ttc ttg agg gac ggt ttc cat gag gga gat att gtg 1104
Gly Trp Arg Ala Phe Leu Arg Asp Gly Phe His Glu Gly Asp Ile Val
355 360 365

ttg gct tcg cct gtt gtt att act caa gcc ttg aaa tca gga gac att 1152
Leu Ala Ser Pro Val Val Ile Thr Gln Ala Leu Lys Ser Gly Asp Ile
370 375 380 

agg cgg gct cat gac tcc tgg caa act tgg ctg aac cgt ttc ggg agg 1200
Arg Arg Ala His Asp Ser Trp Gln Thr Trp Leu Asn Arg Phe Gly Arg
385 390 395 400

gag tcc ttc tct tca gcg ata gag agg atc ttt ctg ggc acg cat cct 1248
Glu Ser Phe Ser Ser Ala Ile Glu Arg Ile Phe Leu Gly Thr His Pro
405 410 415

cct ggt ggt gaa aca tgg agt ttc cct cat gat tgg gac cta ttc aag 1296
Pro Gly Gly Glu Thr Trp Ser Phe Pro His Asp Trp Asp Leu Phe Lys
420 425 430

cta atg gga ata gga tct ggc ggg ttt ggt cca gtt ttt gaa agc ggg 1344
Leu Met Gly Ile Gly Ser Gly Gly Phe Gly Pro Val Phe Glu Ser Gly
435 440 445

ttt att gag atc ctt cgc ttg gtc ata aac gga tat gaa gaa aat cag 1392
Phe Ile Glu Ile Leu Arg Leu Val Ile Asn Gly Tyr Glu Glu Asn Gln
450 455 460

cgg atg tgc tct gaa gga atc tca gaa ctt cca cgt cga ata gcc tct 1440
Arg Met Cys Ser Glu Gly Ile Ser Glu Leu Pro Arg Arg Ile Ala Ser
465 470 475 480

caa gtg gtt aac ggt gtg tct gta agc cag cgt ata cgc cat gtt caa 1488
Gln Val Val Asn Gly Val Ser Val Ser Gln Arg Ile Arg His Val Gln
485 490 495

gtc agg gcg att gag aag gaa aag aca aaa ata aag ata agg ctt aag 1536
Val Arg Ala Ile Glu Lys Glu Lys Thr Lys Ile Lys Ile Arg Leu Lys
500 505 510

agc ggg ata tct gaa ctt tat gat aag gtg gtg gtt aca tct gga ctc 1584
Ser Gly Ile Ser Glu Leu Tyr Asp Lys Val Val Val Thr Ser Gly Leu
515 520 525

gca aat atc caa ctc agg cat tgt ctg aca tgc gat acc acc att ttt 1632
Ala Asn Ile Gln Leu Arg His Cys Leu Thr Cys Asp Thr Thr Ile Phe
530 535 540

cgt gca cca gtg aac caa gcg gtt gat aac agc cat atg aca ggc tcg 1680
Arg Ala Pro Val Asn Gln Ala Val Asp Asn Ser His Met Thr Gly Ser
545 550 555 560

tca aaa ctc ttt ctg ctg act gaa cga aaa ttt tgg tta gac cat atc 1728
Ser Lys Leu Phe Leu Leu Thr Glu Arg Lys Phe Trp Leu Asp His Ile
565 570 575

ctc ccg tcc tgt gtc ctc atg gac ggg atc gca aaa gca gtg tac tgc 1776
Leu Pro Ser Cys Val Leu Met Asp Gly Ile Ala Lys Ala Val Tyr Cys
580 585 590

ttg gac tat gag ccg cag gat ccg aat ggt aaa ggt ctg gtg ccc ccc 1824
Leu Asp Tyr Glu Pro Gln Asp Pro Asn Gly Lys Gly Leu Val Pro Pro
595 600 605

act tat aca tgg gag gac gac tcc cac aag ctg ttg gcg gtt ccc gac 1872
Thr Tyr Thr Trp Glu Asp Asp Ser His Lys Leu Leu Ala Val Pro Asp
610 615 620

aaa aaa gag cga ttc tgt ctg ctg cgg gac gca att tcg aga tct ttc 1920
Lys Lys Glu Arg Phe Cys Leu Leu Arg Asp Ala Ile Ser Arg Ser Phe
625 630 635 640

ccg gcg ttt gcc cag cat cta gtt cct gcc tgc gct gat tac gac caa 1968
Pro Ala Phe Ala Gln His Leu Val Pro Ala Cys Ala Asp Tyr Asp Gln
645 650 655 

aat gtt gtt caa cat gat tgg ctt aca gac gag aat gcc ggg gga gct 2016
Asn Val Val Gln His Asp Trp Leu Thr Asp Glu Asn Ala Gly Gly Ala
660 665 670

ttc aaa ctc aac cgg cgt ggc gag gat ttt tat tct gaa gaa ctt ttc 2064
Phe Lys Leu Asn Arg Arg Gly Glu Asp Phe Tyr Ser Glu Glu Leu Phe
675 680 685

ttt caa gcg ctg gac atg cct aat gat acc gga gtt tac ttg gcg ggt 2112
Phe Gln Ala Leu Asp Met Pro Asn Asp Thr Gly Val Tyr Leu Ala Gly
690 695 700

tgc agt tgt tcc ttc acc ggt gga tgg gtg gag ggc gct att cag acc 2160
Cys Ser Cys Ser Phe Thr Gly Gly Trp Val Glu Gly Ala Ile Gln Thr
705 710 715 720

gcg tgt aac gcc gtc tgt gca att atc cac aat tgt gga ggt att ttg 2208
Ala Cys Asn Ala Val Cys Ala Ile Ile His Asn Cys Gly Gly Ile Leu
725 730 735

gca aag gac aat cct ctc gaa cac tct tgg aag aga tat aac tac cgc 2256
Ala Lys Asp Asn Pro Leu Glu His Ser Trp Lys Arg Tyr Asn Tyr Arg
740 745 750

aat aga aat taa 2268
Asn Arg Asn
755

<210> 32 
<211> 755 
<212> PRT 

<213> Agrobacterium tumefaciens 
<400> 32 

Met Ser Ala Ser Pro Leu Leu Asp Asn Gln Cys Asp His Phe Ser Thr
1 5 10 15

Lys Met Val Asp Leu Ile Met Val Asp Lys Ala Asp Glu Leu Asp Arg
20 25 30

Arg Val Ser Asp Ala Phe Ser Glu Arg Glu Ala Ser Arg Gly Arg Arg
35 40 45

Ile Thr Gln Ile Ser Gly Glu Cys Ser Ala Gly Leu Ala Cys Lys Arg
50 55 60

Leu Ala Asp Gly Arg Phe Pro Glu Ile Ser Thr Gly Glu Lys Val Ala
65 70 75 80

Ala Leu Ser Ala Tyr Ile Tyr Val Gly Lys Glu Ile Leu Gly Arg Ile
85 90 95

Leu Glu Ser Glu Pro Trp Ala Arg Ala Arg Val Ser Gly Leu Val Ala
100 105 110

Ile Asp Leu Ala Pro Phe Cys Met Asp Phe Ser Glu Ala Gln Leu Leu
115 120 125

Gln Thr Leu Phe Leu Leu Ser Gly Lys Arg Cys Ala Ser Ser Asp Leu
130 135 140

Ser His Phe Val Ala Ile Ser Ile Ser Lys Thr Ala Arg Ser Arg Thr
145 150 155 160

Leu Gln Met Pro Pro Tyr Glu Lys Gly Thr Thr Lys Arg Val Thr Gly
165 170 175 

Phe Thr Leu Thr Leu Glu Glu Ala Val Pro Phe Asp Met Val Ala Tyr
180 185 190

Gly Arg Asn Leu Met Leu Lys Ala Ser Ala Gly Ser Phe Pro Thr Ile
195 200 205

Asp Leu Leu Tyr Asp Tyr Arg Ser Phe Phe Asp Gln Cys Ser Asp Ile
210 215 220

Gly Arg Ile Gly Phe Phe Pro Glu Asp Val Pro Lys Pro Lys Val Ala
225 230 235 240

Ile Ile Gly Ala Gly Ile Ser Gly Leu Val Val Ala Ser Glu Leu Leu
245 250 255

His Ala Gly Val Asp Asp Val Thr Ile Tyr Glu Ala Ser Asp Arg Val
260 265 270

Gly Gly Lys Leu Trp Ser His Ala Phe Lys Asp Ala Pro Ser Val Val
275 280 285

Ala Glu Met Gly Ala Met Arg Phe Pro Pro Ala Ala Ser Cys Leu Phe
290 295 300

Phe Phe Leu Glu Arg Tyr Gly Leu Ser Ser Met Arg Pro Phe Pro Asn
305 310 315 320

Pro Gly Thr Val Asp Thr Asn Leu Val Tyr Gln Gly Leu Arg Tyr Val
325 330 335

Trp Lys Ala Gly Gln Gln Pro Pro Lys Leu Phe His Arg Val Tyr Ser
340 345 350

Gly Trp Arg Ala Phe Leu Arg Asp Gly Phe His Glu Gly Asp Ile Val
355 360 365

Leu Ala Ser Pro Val Val Ile Thr Gln Ala Leu Lys Ser Gly Asp Ile
370 375 380

Arg Arg Ala His Asp Ser Trp Gln Thr Trp Leu Asn Arg Phe Gly Arg
385 390 395 400

Glu Ser Phe Ser Ser Ala Ile Glu Arg Ile Phe Leu Gly Thr His Pro
405 410 415

Pro Gly Gly Glu Thr Trp Ser Phe Pro His Asp Trp Asp Leu Phe Lys
420 425 430

Leu Met Gly Ile Gly Ser Gly Gly Phe Gly Pro Val Phe Glu Ser Gly
435 440 445

Phe Ile Glu Ile Leu Arg Leu Val Ile Asn Gly Tyr Glu Glu Asn Gln
450 455 460

Arg Met Cys Ser Glu Gly Ile Ser Glu Leu Pro Arg Arg Ile Ala Ser
465 470 475 480

Gln Val Val Asn Gly Val Ser Val Ser Gln Arg Ile Arg His Val Gln
485 490 495

Val Arg Ala Ile Glu Lys Glu Lys Thr Lys Ile Lys Ile Arg Leu Lys
500 505 510

Ser Gly Ile Ser Glu Leu Tyr Asp Lys Val Val Val Thr Ser Gly Leu
515 520 525

Ala Asn Ile Gln Leu Arg His Cys Leu Thr Cys Asp Thr Thr Ile Phe
530 535 540 

Arg Ala Pro Val Asn Gln Ala Val Asp Asn Ser His Met Thr Gly Ser
545 550 555 560

Ser Lys Leu Phe Leu Leu Thr Glu Arg Lys Phe Trp Leu Asp His Ile
565 570 575

Leu Pro Ser Cys Val Leu Met Asp Gly Ile Ala Lys Ala Val Tyr Cys
580 585 590

Leu Asp Tyr Glu Pro Gln Asp Pro Asn Gly Lys Gly Leu Val Pro Pro
595 600 605

Thr Tyr Thr Trp Glu Asp Asp Ser His Lys Leu Leu Ala Val Pro Asp
610 615 620

Lys Lys Glu Arg Phe Cys Leu Leu Arg Asp Ala Ile Ser Arg Ser Phe
625 630 635 640

Pro Ala Phe Ala Gln His Leu Val Pro Ala Cys Ala Asp Tyr Asp Gln
645 650 655

Asn Val Val Gln His Asp Trp Leu Thr Asp Glu Asn Ala Gly Gly Ala
660 665 670

Phe Lys Leu Asn Arg Arg Gly Glu Asp Phe Tyr Ser Glu Glu Leu Phe
675 680 685

Phe Gln Ala Leu Asp Met Pro Asn Asp Thr Gly Val Tyr Leu Ala Gly
690 695 700

Cys Ser Cys Ser Phe Thr Gly Gly Trp Val Glu Gly Ala Ile Gln Thr
705 710 715 720

Ala Cys Asn Ala Val Cys Ala Ile Ile His Asn Cys Gly Gly Ile Leu
725 730 735

Ala Lys Asp Asn Pro Leu Glu His Ser Trp Lys Arg Tyr Asn Tyr Arg
740 745 750

Asn Arg Asn
755



<210> 33 
<211> 1404 
<212> DNA 

<213> Agrobacterium tumefaciens 

<220> 

<221> CDS 

<222> (1)..(1401) 

<223> coding for indoleacetamide hydrolase

<400> 33 

atg gtg ccc att acc tcg tta gca caa acc cta gaa cgc ctg aga cgg 48
Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

aaa gac tac tcc tgc tta gaa cta gta gaa act ctg ata gcg cgt tgc 96
Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

caa gct gca aaa cca tta aat gcc ctt ctg gct aca gac tgg gat ggc 144
Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45 

ttg cgg cga agc gcc aaa aaa aat gat cgt cat gga aac gcc gga tta 192
Leu Arg Arg Ser Ala Lys Lys Asn Asp Arg His Gly Asn Ala Gly Leu
50 55 60

ggt ctt tgc ggc att cca ctc tgt ttt aag gcg aac atc gcg acc ggc 240
Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

gta ttt cct aca agc gct gct act ccg gcg ctg ata aac cac ttg cca 288
Val Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

aag ata cca tcc cgc gtc gca gaa aga ctt ttt tca gct gga gca ctg 336
Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110

ccg ggt gcc tcg gga aac atg cat gag tta tcg ttt gga att acg agc 384
Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac tat gcc acc ggt gcg gtg cgg aac ccg tgg aat cca agt ctg 432
Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

ata cca ggg ggt tca agc ggt ggt gtg gct gct gcg gtg gca agc cga 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

ttg atg tta ggc ggc ata ggc acg gat acc ggt gca tct gtt cgc cta 528
Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

ccg gca gcc ctg tgt ggc gta gta gga ttt cga ccg acg ctt ggt cga 576
Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Gly Arg
180 185 190

tat cca aga gat cgg ata ata ccg ttc agc ccc acc cgg gac acc gcc 624
Tyr Pro Arg Asp Arg Ile Ile Pro Phe Ser Pro Thr Arg Asp Thr Ala
195 200 205 

gga atc ata gcg cag tgc gta gcc gat gtt ata atc ctc gac cag gtg 672
Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

att tcc gga cgg tcg gcg aaa att tca ccc atg ccg ctg aag ggg ctt 720
Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

cgg atc ggc ctc ccc act acc tac ttt tac gat gac ctt gat gct gat 768
Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

gtg gcc ttc gca gct gaa acg acg att cgc ttg cta gcc aac aga ggc 816
Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

gta acc ttt gtt gaa gcc gac atc ccc cac cta gag gaa ttg aac agt 864
Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

ggg gca agt ttg cca att gcg ctt tac gaa ttt cca cac gct cta aaa 912
Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

aag tat ctc gac gat ttt gtg gga aca gtt tct ttt tct gac gtt atc 960
Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320 

aaa gga att cgt agc ccc gat gta gcg aac att gtc agt gcg caa att 1008
Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

gat ggg cat caa att tcc aac gat gaa tat gaa ctg gcg cgt caa tcc 1056
Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

ttc agg cca agg ctc cag gcc act tat cgg aat tac ttc aga ctc tat 1104
Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

cag tta gat gca atc ctt ttc cca act gca ccc tta gcg gcc aaa gcc 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

ata ggt cag gag tcg tca gtc atc cac aat ggc tca atg atg aac act 1200
Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Met Asn Thr
385 390 395 400

ttc aag atc tac gtg cga aat gtg gac cca agc agc aac gca ggc cta 1248
Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

cct ggg ttg agc ctt cct gcc tgc ctt aca cct gat cgc ttg cct gtt 1296
Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

gga atg gaa att gat gga tta gcg ggg tca gac cac cgt ctg tta gca 1344
Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

atc ggg gca gca tta gaa aaa gct ata aat ttt tct tcc ttt ccc gat 1392
Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Ser Ser Phe Pro Asp
450 455 460

gct ttt aat tag 1404
Ala Phe Asn
465

<210> 34 
<211> 467 
<212> PRT 

<213> Agrobacterium tumefaciens
<400> 34 

Met Val Pro Ile Thr Ser Leu Ala Gln Thr Leu Glu Arg Leu Arg Arg
1 5 10 15

Lys Asp Tyr Ser Cys Leu Glu Leu Val Glu Thr Leu Ile Ala Arg Cys
20 25 30

Gln Ala Ala Lys Pro Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Gly
35 40 45

Leu Arg Arg Ser Ala Lys Lys Asn Asp Arg His Gly Asn Ala Gly Leu
50 55 60

Gly Leu Cys Gly Ile Pro Leu Cys Phe Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Val Phe Pro Thr Ser Ala Ala Thr Pro Ala Leu Ile Asn His Leu Pro
85 90 95

Lys Ile Pro Ser Arg Val Ala Glu Arg Leu Phe Ser Ala Gly Ala Leu
100 105 110 

Pro Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn Tyr Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ala Leu Cys Gly Val Val Gly Phe Arg Pro Thr Leu Gly Arg
180 185 190

Tyr Pro Arg Asp Arg Ile Ile Pro Phe Ser Pro Thr Arg Asp Thr Ala
195 200 205

Gly Ile Ile Ala Gln Cys Val Ala Asp Val Ile Ile Leu Asp Gln Val
210 215 220

Ile Ser Gly Arg Ser Ala Lys Ile Ser Pro Met Pro Leu Lys Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

Val Ala Phe Ala Ala Glu Thr Thr Ile Arg Leu Leu Ala Asn Arg Gly
260 265 270

Val Thr Phe Val Glu Ala Asp Ile Pro His Leu Glu Glu Leu Asn Ser
275 280 285

Gly Ala Ser Leu Pro Ile Ala Leu Tyr Glu Phe Pro His Ala Leu Lys
290 295 300

Lys Tyr Leu Asp Asp Phe Val Gly Thr Val Ser Phe Ser Asp Val Ile
305 310 315 320

Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Val Ser Ala Gln Ile
325 330 335

Asp Gly His Gln Ile Ser Asn Asp Glu Tyr Glu Leu Ala Arg Gln Ser
340 345 350

Phe Arg Pro Arg Leu Gln Ala Thr Tyr Arg Asn Tyr Phe Arg Leu Tyr
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Ala Ala Lys Ala
370 375 380

Ile Gly Gln Glu Ser Ser Val Ile His Asn Gly Ser Met Met Asn Thr
385 390 395 400

Phe Lys Ile Tyr Val Arg Asn Val Asp Pro Ser Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Ala Cys Leu Thr Pro Asp Arg Leu Pro Val
420 425 430

Gly Met Glu Ile Asp Gly Leu Ala Gly Ser Asp His Arg Leu Leu Ala
435 440 445

Ile Gly Ala Ala Leu Glu Lys Ala Ile Asn Phe Ser Ser Phe Pro Asp
450 455 460

Ala Phe Asn
465

<210> 35
<211> 1419
<212> DNA

<213> Agrobacterium vitis

<220>

<221> CDS

<222> (1)..(1416)

<223> coding for indoleacetamide hydrolase
<400> 35

atg gtg acc cta ggt tca atc aag gaa acc ctg gaa tgt ctc agg ctg 48
Met Val Thr Leu Gly Ser Ile Lys Glu Thr Leu Glu Cys Leu Arg Leu
1 5 10 15

aaa aaa tac tcc tgt tcc gaa ctg gct gaa acc ata ata gcc cgt tgc 96
Lys Lys Tyr Ser Cys Ser Glu Leu Ala Glu Thr Ile Ile Ala Arg Cys
20 25 30

gaa gcc gcg aaa tct ctc aat gct ctt ctg gcg act gac tgg gat tac 144
Glu Ala Ala Lys Ser Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Tyr
35 40 45

ctg cgg cgt aat gcc aag aaa gta gat gaa gat gga agc gcc ggc gag 192
Leu Arg Arg Asn Ala Lys Lys Val Asp Glu Asp Gly Ser Ala Gly Glu
50 55 60

ggt ctt gcc ggc atc ccg ctg tgt tct aaa gcg aac att gca aca ggc 240
Gly Leu Ala Gly Ile Pro Leu Cys Ser Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

ata ttc cca gca agc gcg gcc acg ccg gcg ctt gat gaa cat tta cct 288
Ile Phe Pro Ala Ser Ala Ala Thr Pro Ala Leu Asp Glu His Leu Pro
85 90 95

aca aca cca gcc ggc gtc cgt aaa ccg ctt cta gac gct ggg gca ctg 336
Thr Thr Pro Ala Gly Val Arg Lys Pro Leu Leu Asp Ala Gly Ala Leu
100 105 110

ata ggc gct tcg gga aac atg cat gag tta tcg ttt ggc att acc agt 384
Ile Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

aac aac cac gcc act ggt gcg gtg aga aac ccc tgg aat ccc agc tta 432
Asn Asn His Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

ata cca gga ggc tcg agc ggc ggc gtg gct gct gct gta gca tca cgg 480
Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

tta atg ctc ggc gga att ggc acc gac acg ggg gct tcg gtc cgc cta 528
Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

cct gca tcc cta tgt ggc gta gtg gga ttc cgc ccg acg atc ggc aga 576
Pro Ala Ser Leu Cys Gly Val Val Gly Phe Arg Pro Thr Ile Gly Arg
180 185 190

tat cct gga gac cga att gtg ccg gtt agc ccc acc cgc gat aca gcc 624
Tyr Pro Gly Asp Arg Ile Val Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205 

gga att atc gca cag agc gtt cct gat gtg ata ctc ctt gac caa atc 672
Gly Ile Ile Ala Gln Ser Val Pro Asp Val Ile Leu Leu Asp Gln Ile
210 215 220

att tgc ggg aag ctc acg acc cac caa cct gta ccc ctg gag gga tta 720
Ile Cys Gly Lys Leu Thr Thr His Gln Pro Val Pro Leu Glu Gly Leu
225 230 235 240

cgt atc ggc ttg cca acc act tac ttt tac gat gac ctt gat gct gat 768
Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

gtg gcc ttc gca gct gaa aac ctt atc acg ctg ctg gcc agc aag ggt 816
Val Ala Phe Ala Ala Glu Asn Leu Ile Thr Leu Leu Ala Ser Lys Gly
260 265 270

gta acc ttt gtt aag gcc gag att cca gat ctg cag cgt ctg aac atc 864
Val Thr Phe Val Lys Ala Glu Ile Pro Asp Leu Gln Arg Leu Asn Ile
275 280 285

ggg gtt agc ttt cct att gcc ctg tac gag ttt ccg ttc gcc cta caa 912
Gly Val Ser Phe Pro Ile Ala Leu Tyr Glu Phe Pro Phe Ala Leu Gln
290 295 300

aag tat atc gat gac ttt gtg aag gat gtg tct ttt tct gac gtc atc 960
Lys Tyr Ile Asp Asp Phe Val Lys Asp Val Ser Phe Ser Asp Val Ile
305 310 315 320

aaa gga att cgt agc cct gat gta gcc aac att gcc aat gct caa att 1008
Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Ala Asn Ala Gln Ile
325 330 335

gat gga cat caa att tcc aaa gct tca tat gaa ctg gcg cga caa tct 1056
Asp Gly His Gln Ile Ser Lys Ala Ser Tyr Glu Leu Ala Arg Gln Ser
340 345 350

ttc aga cca aag ctg caa gcc gcc tac cat gat tac ttc aag ctg cac 1104
Phe Arg Pro Lys Leu Gln Ala Ala Tyr His Asp Tyr Phe Lys Leu His
355 360 365

cag cta gac gcg atc ctt ttc ccg aca gct ccc ctg aca gcc aaa ccg 1152
Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Thr Ala Lys Pro
370 375 380

atc ggc caa gat tta tcg gtg atg cac aat ggc gta atg gcc gac acg 1200
Ile Gly Gln Asp Leu Ser Val Met His Asn Gly Val Met Ala Asp Thr
385 390 395 400

ttt aaa atc ttc gtg cga aat gtg gat ccg ggg agc aac gca ggc ctg 1248
Phe Lys Ile Phe Val Arg Asn Val Asp Pro Gly Ser Asn Ala Gly Leu
405 410 415

cca gga tta agc ctt ccc gtt tct ctt act tca aag ggt ttg cct att 1296
Pro Gly Leu Ser Leu Pro Val Ser Leu Thr Ser Lys Gly Leu Pro Ile
420 425 430

gga atg gaa atc gat gga tta gcg ggc atg gac gac cgt ttg cta gca 1344
Gly Met Glu Ile Asp Gly Leu Ala Gly Met Asp Asp Arg Leu Leu Ala
435 440 445

atc gga gcg gca cta gag gaa gcg ata gct ttt cat aat tta cct gac 1392
Ile Gly Ala Ala Leu Glu Glu Ala Ile Ala Phe His Asn Leu Pro Asp
450 455 460

ttc ccg aaa gtc gag aca aac tac tga 1419
Phe Pro Lys Val Glu Thr Asn Tyr
465 470

<210> 36
<211> 472
<212> PRT

<213> Agrobacterium vitis
<400> 36

Met Val Thr Leu Gly Ser Ile Lys Glu Thr Leu Glu Cys Leu Arg Leu
1 5 10 15

Lys Lys Tyr Ser Cys Ser Glu Leu Ala Glu Thr Ile Ile Ala Arg Cys
20 25 30

Glu Ala Ala Lys Ser Leu Asn Ala Leu Leu Ala Thr Asp Trp Asp Tyr
35 40 45

Leu Arg Arg Asn Ala Lys Lys Val Asp Glu Asp Gly Ser Ala Gly Glu
50 55 60

Gly Leu Ala Gly Ile Pro Leu Cys Ser Lys Ala Asn Ile Ala Thr Gly
65 70 75 80

Ile Phe Pro Ala Ser Ala Ala Thr Pro Ala Leu Asp Glu His Leu Pro
85 90 95

Thr Thr Pro Ala Gly Val Arg Lys Pro Leu Leu Asp Ala Gly Ala Leu
100 105 110

Ile Gly Ala Ser Gly Asn Met His Glu Leu Ser Phe Gly Ile Thr Ser
115 120 125

Asn Asn His Ala Thr Gly Ala Val Arg Asn Pro Trp Asn Pro Ser Leu
130 135 140

Ile Pro Gly Gly Ser Ser Gly Gly Val Ala Ala Ala Val Ala Ser Arg
145 150 155 160

Leu Met Leu Gly Gly Ile Gly Thr Asp Thr Gly Ala Ser Val Arg Leu
165 170 175

Pro Ala Ser Leu Cys Gly Val Val Gly Phe Arg Pro Thr Ile Gly Arg
180 185 190

Tyr Pro Gly Asp Arg Ile Val Pro Val Ser Pro Thr Arg Asp Thr Ala
195 200 205

Gly Ile Ile Ala Gln Ser Val Pro Asp Val Ile Leu Leu Asp Gln Ile
210 215 220

Ile Cys Gly Lys Leu Thr Thr His Gln Pro Val Pro Leu Glu Gly Leu
225 230 235 240

Arg Ile Gly Leu Pro Thr Thr Tyr Phe Tyr Asp Asp Leu Asp Ala Asp
245 250 255

Val Ala Phe Ala Ala Glu Asn Leu Ile Thr Leu Leu Ala Ser Lys Gly
260 265 270

Val Thr Phe Val Lys Ala Glu Ile Pro Asp Leu Gln Arg Leu Asn Ile
275 280 285

Gly Val Ser Phe Pro Ile Ala Leu Tyr Glu Phe Pro Phe Ala Leu Gln
290 295 300

Lys Tyr Ile Asp Asp Phe Val Lys Asp Val Ser Phe Ser Asp Val Ile
305 310 315 320

Lys Gly Ile Arg Ser Pro Asp Val Ala Asn Ile Ala Asn Ala Gln Ile
325 330 335 

Asp Gly His Gln Ile Ser Lys Ala Ser Tyr Glu Leu Ala Arg Gln Ser
340 345 350

Phe Arg Pro Lys Leu Gln Ala Ala Tyr His Asp Tyr Phe Lys Leu His
355 360 365

Gln Leu Asp Ala Ile Leu Phe Pro Thr Ala Pro Leu Thr Ala Lys Pro
370 375 380

Ile Gly Gln Asp Leu Ser Val Met His Asn Gly Val Met Ala Asp Thr
385 390 395 400

Phe Lys Ile Phe Val Arg Asn Val Asp Pro Gly Ser Asn Ala Gly Leu
405 410 415

Pro Gly Leu Ser Leu Pro Val Ser Leu Thr Ser Lys Gly Leu Pro Ile
420 425 430

Gly Met Glu Ile Asp Gly Leu Ala Gly Met Asp Asp Arg Leu Leu Ala
435 440 445

Ile Gly Ala Ala Leu Glu Glu Ala Ile Ala Phe His Asn Leu Pro Asp
450 455 460

Phe Pro Lys Val Glu Thr Asn Tyr
465 470

<210> 37 
<211> 1263 
<212> DNA 

<213> Arabidopsis thaliana 

<220> 

<221> CDS 

<222> (1)..(1260) 

<223> coding for 5-methylthioribose kinase
<400> 37

atg tct ttt gag gag ttt acg ccg tta aac gag aag tct ctt gta gac 48
Met Ser Phe Glu Glu Phe Thr Pro Leu Asn Glu Lys Ser Leu Val Asp
1 5 10 15

tac atc aag tca aca cct gct ctc tct tcc aag atc gga gcc gac aag 96
Tyr Ile Lys Ser Thr Pro Ala Leu Ser Ser Lys Ile Gly Ala Asp Lys
20 25 30

tcc gat gat gat ttg gtt atc aaa gaa gtt gga gat ggc aat ctc aat 144
Ser Asp Asp Asp Leu Val Ile Lys Glu Val Gly Asp Gly Asn Leu Asn
35 40 45

ttc gtt ttc atc gtt gtt gga tcc tct ggt tct ctt gtc atc aaa cag 192
Phe Val Phe Ile Val Val Gly Ser Ser Gly Ser Leu Val Ile Lys Gln
50 55 60

gct ctt cca tat att cgc tgt atc ggt gaa tca tgg cca atg acg aaa 240
Ala Leu Pro Tyr Ile Arg Cys Ile Gly Glu Ser Trp Pro Met Thr Lys
65 70 75 80

gaa aga gct tat ttt gaa gca aca act ttg aga aag cat gga aat tta 288
Glu Arg Ala Tyr Phe Glu Ala Thr Thr Leu Arg Lys His Gly Asn Leu
85 90 95

tca cct gat cat gtt cct gaa gtc tac cat ttt gac aga aca atg gcg 336
Ser Pro Asp His Val Pro Glu Val Tyr His Phe Asp Arg Thr Met Ala
100 105 110 

ttg att gga atg aga tac ctt gag cct cct cat atc att ctc cgc aaa 384
Leu Ile Gly Met Arg Tyr Leu Glu Pro Pro His Ile Ile Leu Arg Lys
115 120 125

gga ctc att gct ggg att gag tat cct ttc ctc gca gac cac atg tct 432
Gly Leu Ile Ala Gly Ile Glu Tyr Pro Phe Leu Ala Asp His Met Ser
130 135 140

gat tac atg gcg aag act ctc ttc ttc act tct ctc ctc tat cac gat 480
Asp Tyr Met Ala Lys Thr Leu Phe Phe Thr Ser Leu Leu Tyr His Asp
145 150 155 160

acc aca gag cac aga aga gca gta acc gaa ttt tgt ggt aat gtg gag 528
Thr Thr Glu His Arg Arg Ala Val Thr Glu Phe Cys Gly Asn Val Glu
165 170 175

tta tgc cga tta acg gag caa gtt gtg ttt tcg gac cca tat aga gtt 576
Leu Cys Arg Leu Thr Glu Gln Val Val Phe Ser Asp Pro Tyr Arg Val
180 185 190

tcc aca ttt aat cgt tgg act tca cct tat ctt gat gat gat gct aag 624
Ser Thr Phe Asn Arg Trp Thr Ser Pro Tyr Leu Asp Asp Asp Ala Lys
195 200 205

gct gtg cgc gaa gac agt gcc ttg aag ctc gaa atc gca gag cta aaa 672
Ala Val Arg Glu Asp Ser Ala Leu Lys Leu Glu Ile Ala Glu Leu Lys
210 215 220

tcg atg ttc tgt gaa aga gct caa gct tta ata cat ggt gat ctt cat 720
Ser Met Phe Cys Glu Arg Ala Gln Ala Leu Ile His Gly Asp Leu His
225 230 235 240

act ggt tct gtc atg gtt act caa gat tca acg caa gtt ata gat cca 768
Thr Gly Ser Val Met Val Thr Gln Asp Ser Thr Gln Val Ile Asp Pro
245 250 255

gag ttt tcg ttc tat gga ccg atg ggt ttc gat att ggc gct tat ctt 816
Glu Phe Ser Phe Tyr Gly Pro Met Gly Phe Asp Ile Gly Ala Tyr Leu
260 265 270

ggt aac ttg ata cta gct ttc ttt gca caa gat gga cac gcc act cag 864
Gly Asn Leu Ile Leu Ala Phe Phe Ala Gln Asp Gly His Ala Thr Gln
275 280 285

gaa aat gat cga aaa gaa tac aag cag tgg ate ttg aga acc att gag 912 
Glu Asn Asp Arg Lys Glu Tyr Lys Gin Trp He Leu Arg Thr He Glu 
290 295 300 

caa act tgg aat ttg ttt aac aaa agg ttc att gcg eta tgg gat caa 960 
Gin Thr Trp Asn Leu Phe Asn Lys Arg Phe He Ala Leu Trp Asp Gin 
305 310 315 320 

aac aaa gat gga cca ggc gaa gca tac ctt gca gat ate tat aac aat 1008 
Asn Lys Asp Gly Pro Gly Glu Ala Tyr Leu Ala Asp He Tyr Asn Asn 
325 330 335 

acc gag gtt ttg aag ttt gtt caa gaa aac tac atg agg aat ttg ttg 1056 
Thr Glu Val Leu Lys Phe Val Gin Glu Asn Tyr Met Arg Asn Leu Leu 
340 345 . 350 

cat gac tea etc gga ttc ggc get gca aag atg att agg aga att gtg 1104 
His Asp Ser Leu Gly Phe Gly Ala Ala Lys Met He Arg Arg He Val 
355 360 365 

gga gtg gca cat gtt gag gac ttt gaa tea ate gaa gaa gat aag cga 1152 
Gly Val Ala His Val Glu Asp Phe Glu Ser He Glu Glu Asp Lys Arg 
370 375 380

aga gct att tgc gag aga agt gca ctc gag ttt gcg aag atg ctt ctc 1200
Arg Ala Ile Cys Glu Arg Ser Ala Leu Glu Phe Ala Lys Met Leu Leu
385 390 395 400

aag gaa agg aga aag ttt aag agt atc ggt gaa gtt gtt tca gca att 1248
Lys Glu Arg Arg Lys Phe Lys Ser Ile Gly Glu Val Val Ser Ala Ile
405 410 415

caa caa caa agc taa 1263
Gln Gln Gln Ser
420



<210> 38 
<211> 420 
<212> PRT 

<213> Arabidopsis thaliana
<400> 38 

Met Ser Phe Glu Glu Phe Thr Pro Leu Asn Glu Lys Ser Leu Val Asp 
1 5 10 15

Tyr Ile Lys Ser Thr Pro Ala Leu Ser Ser Lys Ile Gly Ala Asp Lys
20 25 30

Ser Asp Asp Asp Leu Val Ile Lys Glu Val Gly Asp Gly Asn Leu Asn
35 40 45

Phe Val Phe Ile Val Val Gly Ser Ser Gly Ser Leu Val Ile Lys Gln
50 55 60

Ala Leu Pro Tyr Ile Arg Cys Ile Gly Glu Ser Trp Pro Met Thr Lys
65 70 75 80

Glu Arg Ala Tyr Phe Glu Ala Thr Thr Leu Arg Lys His Gly Asn Leu
85 90 95

Ser Pro Asp His Val Pro Glu Val Tyr His Phe Asp Arg Thr Met Ala
100 105 110

Leu Ile Gly Met Arg Tyr Leu Glu Pro Pro His Ile Ile Leu Arg Lys
115 120 125

Gly Leu Ile Ala Gly Ile Glu Tyr Pro Phe Leu Ala Asp His Met Ser
130 135 140

Asp Tyr Met Ala Lys Thr Leu Phe Phe Thr Ser Leu Leu Tyr His Asp
145 150 155 160

Thr Thr Glu His Arg Arg Ala Val Thr Glu Phe Cys Gly Asn Val Glu
165 170 175

Leu Cys Arg Leu Thr Glu Gln Val Val Phe Ser Asp Pro Tyr Arg Val
180 185 190

Ser Thr Phe Asn Arg Trp Thr Ser Pro Tyr Leu Asp Asp Asp Ala Lys
195 200 205

Ala Val Arg Glu Asp Ser Ala Leu Lys Leu Glu Ile Ala Glu Leu Lys
210 215 220

Ser Met Phe Cys Glu Arg Ala Gln Ala Leu Ile His Gly Asp Leu His
225 230 235 240

Thr Gly Ser Val Met Val Thr Gln Asp Ser Thr Gln Val Ile Asp Pro
245 250 255

Glu Phe Ser Phe Tyr Gly Pro Met Gly Phe Asp Ile Gly Ala Tyr Leu
260 265 270

Gly Asn Leu Ile Leu Ala Phe Phe Ala Gln Asp Gly His Ala Thr Gln
275 280 285

Glu Asn Asp Arg Lys Glu Tyr Lys Gln Trp Ile Leu Arg Thr Ile Glu
290 295 300

Gln Thr Trp Asn Leu Phe Asn Lys Arg Phe Ile Ala Leu Trp Asp Gln
305 310 315 320

Asn Lys Asp Gly Pro Gly Glu Ala Tyr Leu Ala Asp Ile Tyr Asn Asn
325 330 335

Thr Glu Val Leu Lys Phe Val Gln Glu Asn Tyr Met Arg Asn Leu Leu
340 345 350

His Asp Ser Leu Gly Phe Gly Ala Ala Lys Met Ile Arg Arg Ile Val
355 360 365

Gly Val Ala His Val Glu Asp Phe Glu Ser Ile Glu Glu Asp Lys Arg
370 375 380

Arg Ala Ile Cys Glu Arg Ser Ala Leu Glu Phe Ala Lys Met Leu Leu
385 390 395 400

Lys Glu Arg Arg Lys Phe Lys Ser Ile Gly Glu Val Val Ser Ala Ile
405 410 415

Gln Gln Gln Ser
420 



<210> 39 
<211> 1200 
<212> DNA 

<213> Klebsiella pneumoniae 

<220> 

<221> CDS

<222> (1)..(1197)

<223> coding for 5-methylthioribose kinase
<400> 39 

atg tcg caa tac cat acc ttc acc gcc cac gat gcc gtg gct tac gcg 48
Met Ser Gln Tyr His Thr Phe Thr Ala His Asp Ala Val Ala Tyr Ala
1 5 10 15

caa cag ttc gcc ggc atc gac aac cca tct gag ctg gtc agc gcg cag 96
Gln Gln Phe Ala Gly Ile Asp Asn Pro Ser Glu Leu Val Ser Ala Gln
20 25 30

gaa gtg ggc gat ggc aac ctc aat ctg gtg ttt aaa gtg ttc gat cgt 144
Glu Val Gly Asp Gly Asn Leu Asn Leu Val Phe Lys Val Phe Asp Arg
35 40 45

cag ggc gtc agc cgg gcg atc gtc aaa cag gcc ctg ccc tac gtg cgc 192
Gln Gly Val Ser Arg Ala Ile Val Lys Gln Ala Leu Pro Tyr Val Arg
50 55 60

tgc gtc ggc gaa tcc tgg ccg ctg acc ctc gac cgc gcc cgt ctc gaa 240
Cys Val Gly Glu Ser Trp Pro Leu Thr Leu Asp Arg Ala Arg Leu Glu
65 70 75 80

gcg cag acc ctg gtc gcc cac tat cag cac agc ccg cag cac acg gta 288
Ala Gln Thr Leu Val Ala His Tyr Gln His Ser Pro Gln His Thr Val
85 90 95

aaa atc cat cac ttt gat ccc gag ctg gcg gtg atg gtg atg gaa gat 336
Lys Ile His His Phe Asp Pro Glu Leu Ala Val Met Val Met Glu Asp
100 105 110

ctt tcc gac cac cgc atc tgg cgc gga gag ctt atc gct aac gtc tac 384
Leu Ser Asp His Arg Ile Trp Arg Gly Glu Leu Ile Ala Asn Val Tyr
115 120 125

tat ccc cag gcg gcc cgc cag ctt ggc gac tat ctg gcg cag gtg ttg 432
Tyr Pro Gln Ala Ala Arg Gln Leu Gly Asp Tyr Leu Ala Gln Val Leu
130 135 140

ttc cac acc agc gat ttc tac ctc cat ccc cac gag aaa aag gcg cag 480
Phe His Thr Ser Asp Phe Tyr Leu His Pro His Glu Lys Lys Ala Gln
145 150 155 160

gtg gcg cag ttt att aac ccg gcg atg tgc gag atc acc gag gat ctg 528
Val Ala Gln Phe Ile Asn Pro Ala Met Cys Glu Ile Thr Glu Asp Leu
165 170 175

ttc ttt aac gac ccg tat cag atc cac gag cgc aat aac tac ccg gcg 576
Phe Phe Asn Asp Pro Tyr Gln Ile His Glu Arg Asn Asn Tyr Pro Ala
180 185 190

gag ctg gag gcc gat gtc gcc gcc ctg cgc gac gac gcc cag ctt aag 624
Glu Leu Glu Ala Asp Val Ala Ala Leu Arg Asp Asp Ala Gln Leu Lys
195 200 205 

ctg gcg gtg gcg gcg ctg aag cac cgt ttc ttt gcc cat gcg gaa gcg 672 
Leu Ala Val Ala Ala Leu Lys His Arg Phe Phe Ala His Ala Glu Ala 
210 215 220 

ctg ctg cac ggc gat atc cac agc ggg tcg atc ttc gtt gcc gaa ggt 720
Leu Leu His Gly Asp Ile His Ser Gly Ser Ile Phe Val Ala Glu Gly
225 230 235 240

agc ctg aag gcc atc gac gcc gag ttc ggc tac ttc ggc ccc atc ggc 768
Ser Leu Lys Ala Ile Asp Ala Glu Phe Gly Tyr Phe Gly Pro Ile Gly
245 250 255

ttc gat atc ggc acc gcc atc ggc aac ctg ctg ctg aac tac tgc ggc 816
Phe Asp Ile Gly Thr Ala Ile Gly Asn Leu Leu Leu Asn Tyr Cys Gly
260 265 270

ctg ccg ggc cag ctc ggc att cgc gat gcc gcc gcc gcg cgc gag cag 864
Leu Pro Gly Gln Leu Gly Ile Arg Asp Ala Ala Ala Ala Arg Glu Gln
275 280 285

cgg ctg aac gac atc cac cag ctg tgg acc acc ttc gcc gag cgc ttc 912
Arg Leu Asn Asp Ile His Gln Leu Trp Thr Thr Phe Ala Glu Arg Phe
290 295 300

cag gcg ctg gcg gcg gag aaa acc cgc gac gcg gcg ctg gct tac ccc 960
Gln Ala Leu Ala Ala Glu Lys Thr Arg Asp Ala Ala Leu Ala Tyr Pro
305 310 315 320

ggc tac gcc tcc gcc ttt ctg aag aaa gtc tgg gcg gac gcg gtc ggc 1008
Gly Tyr Ala Ser Ala Phe Leu Lys Lys Val Trp Ala Asp Ala Val Gly
325 330 335

ttc tgc ggc agc gaa ctg atc cgc cgc agc gtc gga ctg tcg cac gtc 1056
Phe Cys Gly Ser Glu Leu Ile Arg Arg Ser Val Gly Leu Ser His Val
340 345 350

gcg gat atc gac act atc cag gac gac gcc atg cgt cat gag tgc ctg 1104
Ala Asp Ile Asp Thr Ile Gln Asp Asp Ala Met Arg His Glu Cys Leu
355 360 365

cgc cac gcc att acc ctg ggc aga gcg ctg atc gtg ctg gcc gag cgt 1152
Arg His Ala Ile Thr Leu Gly Arg Ala Leu Ile Val Leu Ala Glu Arg
370 375 380

atc gac agc gtc gac gag ctg ctg gcg cgg gta cgc cag tac agc tga 1200
Ile Asp Ser Val Asp Glu Leu Leu Ala Arg Val Arg Gln Tyr Ser
385 390 395 

<210> 40 
<211> 399 
<212> PRT

<213> Klebsiella pneumoniae 
<400> 40 

Met Ser Gln Tyr His Thr Phe Thr Ala His Asp Ala Val Ala Tyr Ala
1 5 10 15

Gln Gln Phe Ala Gly Ile Asp Asn Pro Ser Glu Leu Val Ser Ala Gln
20 25 30

Glu Val Gly Asp Gly Asn Leu Asn Leu Val Phe Lys Val Phe Asp Arg
35 40 45

Gln Gly Val Ser Arg Ala Ile Val Lys Gln Ala Leu Pro Tyr Val Arg
50 55 60

Cys Val Gly Glu Ser Trp Pro Leu Thr Leu Asp Arg Ala Arg Leu Glu
65 70 75 80

Ala Gln Thr Leu Val Ala His Tyr Gln His Ser Pro Gln His Thr Val
85 90 95

Lys Ile His His Phe Asp Pro Glu Leu Ala Val Met Val Met Glu Asp
100 105 110

Leu Ser Asp His Arg Ile Trp Arg Gly Glu Leu Ile Ala Asn Val Tyr
115 120 125

Tyr Pro Gln Ala Ala Arg Gln Leu Gly Asp Tyr Leu Ala Gln Val Leu
130 135 140

Phe His Thr Ser Asp Phe Tyr Leu His Pro His Glu Lys Lys Ala Gln
145 150 155 160

Val Ala Gln Phe Ile Asn Pro Ala Met Cys Glu Ile Thr Glu Asp Leu
165 170 175

Phe Phe Asn Asp Pro Tyr Gln Ile His Glu Arg Asn Asn Tyr Pro Ala
180 185 190

Glu Leu Glu Ala Asp Val Ala Ala Leu Arg Asp Asp Ala Gln Leu Lys
195 200 205

Leu Ala Val Ala Ala Leu Lys His Arg Phe Phe Ala His Ala Glu Ala
210 215 220

Leu Leu His Gly Asp Ile His Ser Gly Ser Ile Phe Val Ala Glu Gly
225 230 235 240

Ser Leu Lys Ala Ile Asp Ala Glu Phe Gly Tyr Phe Gly Pro Ile Gly
245 250 255

Phe Asp Ile Gly Thr Ala Ile Gly Asn Leu Leu Leu Asn Tyr Cys Gly
260 265 270

Leu Pro Gly Gln Leu Gly Ile Arg Asp Ala Ala Ala Ala Arg Glu Gln
275 280 285

Arg Leu Asn Asp Ile His Gln Leu Trp Thr Thr Phe Ala Glu Arg Phe
290 295 300

Gln Ala Leu Ala Ala Glu Lys Thr Arg Asp Ala Ala Leu Ala Tyr Pro
305 310 315 320

Gly Tyr Ala Ser Ala Phe Leu Lys Lys Val Trp Ala Asp Ala Val Gly
325 330 335

Phe Cys Gly Ser Glu Leu Ile Arg Arg Ser Val Gly Leu Ser His Val
340 345 350

Ala Asp Ile Asp Thr Ile Gln Asp Asp Ala Met Arg His Glu Cys Leu
355 360 365

Arg His Ala Ile Thr Leu Gly Arg Ala Leu Ile Val Leu Ala Glu Arg
370 375 380

Ile Asp Ser Val Asp Glu Leu Leu Ala Arg Val Arg Gln Tyr Ser
385 390 395 



<210> 41 
<211> 1140 
<212> DNA 

<213> Arabidopsis thaliana 

<220> 

<221> CDS 

<222> (1)..(1137)

<223> coding for alcohol dehydrogenase 
<400> 41 

atg tct acc acc gga cag att att cga tgc aaa gct gct gtg gca tgg 48
Met Ser Thr Thr Gly Gln Ile Ile Arg Cys Lys Ala Ala Val Ala Trp
1 5 10 15

gaa gcc gga aag cca ctg gtg atc gag gaa gtg gag gtt gct cca ccg 96
Glu Ala Gly Lys Pro Leu Val Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

cag aaa cac gaa gtt cgt atc aag att ctc ttc act tct ctc tgt cac 144
Gln Lys His Glu Val Arg Ile Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

acc gat gtt tac ttc tgg gaa gct aag gga caa aca ccg ttg ttt cca 192
Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Leu Phe Pro
50 55 60

cgt atc ttc ggc cat gaa gct gga ggg att gtt gag agt gtt gga gaa 240
Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80

gga gtg act gat ctt cag cca gga gat cat gtg ttg ccg atc ttt acc 288
Gly Val Thr Asp Leu Gln Pro Gly Asp His Val Leu Pro Ile Phe Thr
85 90 95

gga gaa tgt gga gat tgt cgt cat tgc cag tcg gag gaa tca aac atg 336
Gly Glu Cys Gly Asp Cys Arg His Cys Gln Ser Glu Glu Ser Asn Met
100 105 110

tgt gat ctt ctc agg atc aac aca gag cga gga ggt atg att cac gat 384
Cys Asp Leu Leu Arg Ile Asn Thr Glu Arg Gly Gly Met Ile His Asp
115 120 125 




ggt gaa tct aga ttc tcc att aat ggc aaa cca atc tac cat ttc ctt 432
Gly Glu Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Leu
130 135 140

ggg acg tcc acg ttc agt gag tac act gtg gtt cac tct ggt cag gtc 480
Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Val His Ser Gly Gln Val
145 150 155 160

gct aag atc aat ccg gat gct cct ctt gac aag gtc tgt att gtc agt 528
Ala Lys Ile Asn Pro Asp Ala Pro Leu Asp Lys Val Cys Ile Val Ser
165 170 175

tgt ggt ttg tct act ggg tta gga gca act ttg aat gtg gct aaa ccc 576
Cys Gly Leu Ser Thr Gly Leu Gly Ala Thr Leu Asn Val Ala Lys Pro
180 185 190

aag aaa ggt caa agt gtt gcc att ttt ggt ctt ggt gct gtt ggt tta 624
Lys Lys Gly Gln Ser Val Ala Ile Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

ggc gct gca gaa ggt gct aga atc gct ggt gct tct agg atc atc ggt 672
Gly Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

gtt gat ttt aac tct aaa aga ttc gac caa gct aag gaa ttc ggt gtg 720
Val Asp Phe Asn Ser Lys Arg Phe Asp Gln Ala Lys Glu Phe Gly Val
225 230 235 240

acc gag tgt gtg aac ccg aaa gac cat gac aag cca att caa cag gtg 768
Thr Glu Cys Val Asn Pro Lys Asp His Asp Lys Pro Ile Gln Gln Val
245 250 255

atc gct gag atg acg gat ggt ggg gtg gac agg agt gtg gaa tgc acc 816
Ile Ala Glu Met Thr Asp Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

gga agc gtt cag gcc atg att caa gca ttt gaa tgt gtc cac gat ggc 864
Gly Ser Val Gln Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

tgg ggt gtt gca gtg ctg gtg ggt gtg cca agc aaa gac gat gcc ttc 912
Trp Gly Val Ala Val Leu Val Gly Val Pro Ser Lys Asp Asp Ala Phe
290 295 300

aag act cat ccg atg aat ttc ttg aat gag agg act ctt aag ggt act 960
Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

ttc ttc ggg aac tac aaa ccc aaa act gac att ccc ggg gtt gtg gaa 1008
Phe Phe Gly Asn Tyr Lys Pro Lys Thr Asp Ile Pro Gly Val Val Glu
325 330 335

aag tac atg aac aag gag ctg gag ctt gag aaa ttc atc act cac aca 1056
Lys Tyr Met Asn Lys Glu Leu Glu Leu Glu Lys Phe Ile Thr His Thr
340 345 350

gtg cca ttc tcg gaa atc aac aag gcc ttt gat tac atg ctg aag gga 1104
Val Pro Phe Ser Glu Ile Asn Lys Ala Phe Asp Tyr Met Leu Lys Gly
355 360 365

gag agt att cgt tgc atc atc acc atg ggt gct tga 1140
Glu Ser Ile Arg Cys Ile Ile Thr Met Gly Ala
370 375 

<210> 42
<211> 379
<212> PRT

<213> Arabidopsis thaliana
<400> 42 

Met Ser Thr Thr Gly Gln Ile Ile Arg Cys Lys Ala Ala Val Ala Trp
1 5 10 15

Glu Ala Gly Lys Pro Leu Val Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

Gln Lys His Glu Val Arg Ile Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Leu Phe Pro
50 55 60

Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80

Gly Val Thr Asp Leu Gln Pro Gly Asp His Val Leu Pro Ile Phe Thr
85 90 95

Gly Glu Cys Gly Asp Cys Arg His Cys Gln Ser Glu Glu Ser Asn Met
100 105 110

Cys Asp Leu Leu Arg Ile Asn Thr Glu Arg Gly Gly Met Ile His Asp
115 120 125

Gly Glu Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Leu
130 135 140

Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Val His Ser Gly Gln Val
145 150 155 160

Ala Lys Ile Asn Pro Asp Ala Pro Leu Asp Lys Val Cys Ile Val Ser
165 170 175

Cys Gly Leu Ser Thr Gly Leu Gly Ala Thr Leu Asn Val Ala Lys Pro
180 185 190

Lys Lys Gly Gln Ser Val Ala Ile Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

Gly Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

Val Asp Phe Asn Ser Lys Arg Phe Asp Gln Ala Lys Glu Phe Gly Val
225 230 235 240

Thr Glu Cys Val Asn Pro Lys Asp His Asp Lys Pro Ile Gln Gln Val
245 250 255

Ile Ala Glu Met Thr Asp Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

Gly Ser Val Gln Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

Trp Gly Val Ala Val Leu Val Gly Val Pro Ser Lys Asp Asp Ala Phe
290 295 300

Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

Phe Phe Gly Asn Tyr Lys Pro Lys Thr Asp Ile Pro Gly Val Val Glu
325 330 335

Lys Tyr Met Asn Lys Glu Leu Glu Leu Glu Lys Phe Ile Thr His Thr
340 345 350

Val Pro Phe Ser Glu Ile Asn Lys Ala Phe Asp Tyr Met Leu Lys Gly
355 360 365

Glu Ser Ile Arg Cys Ile Ile Thr Met Gly Ala
370 375 



<210> 43
<211> 1140
<212> DNA

<213> Hordeum vulgare

<220>
<221> CDS
<222> (1)..(1137)

<223> coding for alcohol dehydrogenase
<400> 43

atg gcg acg gcc ggc aag gtg atc aag tgc aaa gcc gcg gtg gcg tgg 48
Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

gag gcc ggg aag ccg ctg acc atg gag gag gtg gag gtg gcg ccg ccg 96
Glu Ala Gly Lys Pro Leu Thr Met Glu Glu Val Glu Val Ala Pro Pro
20 25 30 

cag gcc atg gag gtg cgc gtc aag atc ctc ttc acc tcc ctc tgc cac 144
Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

acc gac gtc tac ttc tgg gag gcc aag ggg cag acc ccc atg ttc cct 192
Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Met Phe Pro
50 55 60

cgg atc ttc ggc cat gaa gct gga ggc ata gtg gag agt gtt gga gag 240
Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80

ggc gtg act gat gtt gcc cct ggt gac cac gtc ctc cct gtg ttc act 288
Gly Val Thr Asp Val Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

ggg gag tgt aag gaa tgc cca cat tgc aag tct gcg gag agc aac atg 336
Gly Glu Cys Lys Glu Cys Pro His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

tgt gat ctg ctc agg atc aac acc gac aga ggt gtg atg atc ggg gat 384
Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

ggc aag tcg cgc ttc tct att ggc ggc aag ccg att tac cat ttc gta 432
Gly Lys Ser Arg Phe Ser Ile Gly Gly Lys Pro Ile Tyr His Phe Val
130 135 140

ggg act tcc acc ttc agt gag tac act gtc atg cat gtc ggt tgt gtt 480
Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

gcc aag atc aac cct gag gct ccc ctt gat aaa gtc tgt gtt ctt agc 528
Ala Lys Ile Asn Pro Glu Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175 



tgt ggt att tgc act ggt ctt ggc gcg tca att aat gtt gca aaa cca 576
Cys Gly Ile Cys Thr Gly Leu Gly Ala Ser Ile Asn Val Ala Lys Pro
180 185 190

cca aag ggt tcc aca gtg gcg ata ttt ggg cta gga gct gtt ggc ctt 624
Pro Lys Gly Ser Thr Val Ala Ile Phe Gly Leu Gly Ala Val Gly Leu
195 200 205 

gct gct gca gaa ggt gca agg att gca ggt gca tca agg atc att ggt 672
Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

gtt gac ctg aac gcc agc aga ttt gaa gag gct agg aag ttt ggc tgc 720
Val Asp Leu Asn Ala Ser Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

acg gaa ttt gtg aac ccg aaa gat cac acc aag cca gtt cag cag gtg 768
Thr Glu Phe Val Asn Pro Lys Asp His Thr Lys Pro Val Gln Gln Val
245 250 255

ctc gct gac atg aca aat ggc gga gtt gac cgc agt gtt gag tgc act 816
Leu Ala Asp Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

ggc aac gtc aat gct atg ata caa gca ttt gaa tgt gtt cat gat ggc 864
Gly Asn Val Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

tgg ggt gta gct gtg ctg gtg ggt gtg cca cac aag gac gct gaa ttc 912
Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

aag acc cac ccg atg aac ttc ctg aat gag agg acc ctg aag ggc acc 960
Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

ttc ttc ggt aac ttc aag ccg cgc act gac ctg ccc aat gtc gtg gag 1008
Phe Phe Gly Asn Phe Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

atg tac atg aag aag gag ctg gag gtg gag aag ttc atc aca cac agc 1056
Met Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

gtg ccg ttc tcg gag ata aac aag gcc ttc gac ctt atg gcg aag ggg 1104
Val Pro Phe Ser Glu Ile Asn Lys Ala Phe Asp Leu Met Ala Lys Gly
355 360 365

gag ggc atc cgt tgc atc atc cgc atg gac aac tag 1140
Glu Gly Ile Arg Cys Ile Ile Arg Met Asp Asn
370 375 

<210> 44 
<211> 379 
<212> PRT 

<213> Hordeum vulgare
<400> 44 

Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

Glu Ala Gly Lys Pro Leu Thr Met Glu Glu Val Glu Val Ala Pro Pro
20 25 30

Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Met Phe Pro
50 55 60

Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80

Gly Val Thr Asp Val Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

Gly Glu Cys Lys Glu Cys Pro His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

Gly Lys Ser Arg Phe Ser Ile Gly Gly Lys Pro Ile Tyr His Phe Val
130 135 140

Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

Ala Lys Ile Asn Pro Glu Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

Cys Gly Ile Cys Thr Gly Leu Gly Ala Ser Ile Asn Val Ala Lys Pro
180 185 190

Pro Lys Gly Ser Thr Val Ala Ile Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

Val Asp Leu Asn Ala Ser Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

Thr Glu Phe Val Asn Pro Lys Asp His Thr Lys Pro Val Gln Gln Val
245 250 255

Leu Ala Asp Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

Gly Asn Val Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

Phe Phe Gly Asn Phe Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

Met Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

Val Pro Phe Ser Glu Ile Asn Lys Ala Phe Asp Leu Met Ala Lys Gly
355 360 365

Glu Gly Ile Arg Cys Ile Ile Arg Met Asp Asn
370 375 



<210> 45 

<211> 1140 

<212> DNA 

<213> Oryza sativa 

<220> 
<221> CDS
<222> (1)..(1137)

<223> coding for alcohol dehydrogenase 
<400> 45 

atg gcg acc gca ggg aag gtg atc aag tgc aaa gcg gcg gtg gca tgg 48
Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

gag gcc gcg aag ccg ctg gtg atc gag gag gtg gag gtg gcg ccg ccg 96
Glu Ala Ala Lys Pro Leu Val Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

cag gcc atg gag gtg cgc gtc aag atc ctc ttc acc tcg ctc tgc cac 144
Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

acc gac gtc tac ttc tgg gag gcc aag gga cag act ccc gtg ttc cct 192
Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Val Phe Pro
50 55 60

cgg atc ttc ggc cat gaa gct gga ggt att gtg gag agt gtt gga gag 240
Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80

ggt gtg act gat ctt gcc cct ggt gac cat gtt ctc cct gtg ttc act 288
Gly Val Thr Asp Leu Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

ggg gag tgc aag gag tgt gcc cac tgc aag tca gca gag agc aac atg 336
Gly Glu Cys Lys Glu Cys Ala His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

tgt gat ctg ctc agg atc aac act gac agg ggt gtg atg att ggt gat 384
Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

ggc aaa tca cgc ttt tcc atc aac ggg aag ccc att tac cat ttc gtc 432
Gly Lys Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Val
130 135 140

ggg act tcg acc ttc agc gag tac act gtc atg cat gtt ggt tgc gtt 480
Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

gcg aag atc aac ccg gca gct cca ctt gat aaa gtt tgc gtt ctt agc 528
Ala Lys Ile Asn Pro Ala Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

tgt ggt att tct act ggt ctt ggt gct aca atc aat gtg gca aag cca 576
Cys Gly Ile Ser Thr Gly Leu Gly Ala Thr Ile Asn Val Ala Lys Pro
180 185 190

cca aag ggt tcg acg gtg gcg ata ttt ggt cta gga gct gta ggc ctt 624
Pro Lys Gly Ser Thr Val Ala Ile Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

gct gcc gca gaa ggt gca agg att gca gga gcg tca agg atc att ggc 672
Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

att gac ctg aac gcc aac aga ttt gaa gaa gct agg aaa ttt ggt tgc 720
Ile Asp Leu Asn Ala Asn Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

act gaa ttt gtg aac cca aag gac cat gac aag cca gtt cag cag gta 768
Thr Glu Phe Val Asn Pro Lys Asp His Asp Lys Pro Val Gln Gln Val
245 250 255

ctt gct gag atg acc aat ggc gga gtt gac cgc agc gtt gaa tgc act 816
Leu Ala Glu Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

ggc aac atc aac gcc atg atc caa gca ttt gaa tgt gtt cat gat ggc 864
Gly Asn Ile Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

tgg ggt gtt gct gtt ttg gtc ggc gtg cca cac aag gac gcc gag ttc 912
Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

aag acc cac ccg atg aac ttc ctg aac gag agg act ctc aag gga acc 960
Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

ttc ttc ggc aac tac aag cca cgc acc gat ctg ccc aac gtc gtc gag 1008
Phe Phe Gly Asn Tyr Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

ctc tac atg aag aag gag ctg gag gtg gag aag ttc atc aca cac agc 1056
Leu Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

gtg ccg ttc tcg gag atc aac acg gcg ttc gac ctg atg cac aag ggc 1104
Val Pro Phe Ser Glu Ile Asn Thr Ala Phe Asp Leu Met His Lys Gly
355 360 365

gag ggc atc cgc tgc atc atc cgc atg gag aac tga 1140
Glu Gly Ile Arg Cys Ile Ile Arg Met Glu Asn
370 375 

<210> 46 

<211> 379 

<212> PRT 

<213> Oryza sativa 

<400> 46 

Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

Glu Ala Ala Lys Pro Leu Val Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Val Phe Pro
50 55 60

Arg Ile Phe Gly His Glu Ala Gly Gly Ile Val Glu Ser Val Gly Glu
65 70 75 80

Gly Val Thr Asp Leu Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

Gly Glu Cys Lys Glu Cys Ala His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

Gly Lys Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Val
130 135 140

Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

Ala Lys Ile Asn Pro Ala Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

Cys Gly Ile Ser Thr Gly Leu Gly Ala Thr Ile Asn Val Ala Lys Pro
180 185 190

Pro Lys Gly Ser Thr Val Ala Ile Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

Ile Asp Leu Asn Ala Asn Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

Thr Glu Phe Val Asn Pro Lys Asp His Asp Lys Pro Val Gln Gln Val
245 250 255

Leu Ala Glu Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

Gly Asn Ile Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

Phe Phe Gly Asn Tyr Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

Leu Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

Val Pro Phe Ser Glu Ile Asn Thr Ala Phe Asp Leu Met His Lys Gly
355 360 365

Glu Gly Ile Arg Cys Ile Ile Arg Met Glu Asn
370 375 



<210> 47 
<211> 1140 
<212> DNA 
<213> Zea mays 

<220> 

<221> CDS 

<222> (1)..(1137) 

<223> coding for alcohol dehydrogenase 
<400> 47 

atg gcg acc gcg ggg aag gtg atc aag tgc aaa gct gcg gtg gca tgg 48
Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

gag gcc ggc aag cca ctg tcg atc gag gag gtg gag gta gcg cct ccg 96
Glu Ala Gly Lys Pro Leu Ser Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

cag gcc atg gag gtg cgc gtc aag atc ctc ttc acc tcg ctc tgc cac 144
Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

acc gac gtc tac ttc tgg gag gcc aag ggg cag act ccc gtg ttc cct 192
Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Val Phe Pro
50 55 60

cgg atc ttt ggc cat gag gct gga ggt atc ata gag agt gtt gga gag 240
Arg Ile Phe Gly His Glu Ala Gly Gly Ile Ile Glu Ser Val Gly Glu
65 70 75 80

ggt gtg act gac gta gct ccg ggc gac cat gtc ctt cct gtg ttc act 288
Gly Val Thr Asp Val Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

ggg gag tgc aag gag tgc gcc cac tgc aag tcg gca gag agc aac atg 336
Gly Glu Cys Lys Glu Cys Ala His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

tgt gat ttg ctc agg atc aac act gac cgc ggt gtg atg att ggc gat 384
Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

ggc aag tcg cgg ttt tca atc aat ggg aag cct atc tac cac ttt gtt 432
Gly Lys Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Val
130 135 140

ggg act tcc acc ttc agc gag tac acc gtc atg cat gtc ggt tgt gtt 480
Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

gca aag atc aac cct cag gct ccc ctt gat aaa gtt tgc gtc ctt agc 528
Ala Lys Ile Asn Pro Gln Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

tgt ggt att tct act ggt ctt ggt gca tca att aat gtt gca aaa cct 576
Cys Gly Ile Ser Thr Gly Leu Gly Ala Ser Ile Asn Val Ala Lys Pro
180 185 190 

ccg aag ggt tcg aca gtg gct gtt ttc ggt tta gga gcc gtt ggt ctt 624
Pro Lys Gly Ser Thr Val Ala Val Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

gcc gct gca gaa ggt gca agg att gct gga gcg tca agg atc att ggt 672
Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

gtc gac ctg aac ccc agc aga ttc gaa gaa gct agg aag ttc ggt tgc 720
Val Asp Leu Asn Pro Ser Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

act gaa ttt gtg aac cca aaa gac cac aac aag ccg gtg cag gag gta 768
Thr Glu Phe Val Asn Pro Lys Asp His Asn Lys Pro Val Gln Glu Val
245 250 255

ctt gct gag atg acc aac gga ggg gtc gac cgc agc gtg gaa tgc act 816
Leu Ala Glu Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

ggc aac atc aat gct atg atc caa gct ttc gaa tgt gtt cat gat ggc 864
Gly Asn Ile Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

tgg ggt gtt gcc gtg ctg gtg ggt gtg ccg cat aag gac gct gag ttc 912
Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

aag acc cac ccg atg aac ttc ctg aac gaa agg acc ctg aag ggg acc 960
Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

ttc ttt ggc aac tat aag cca cgc act gat ctg cca aat gtg gtg gag 1008
Phe Phe Gly Asn Tyr Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

ctg tac atg aaa aag gag ctg gag gtg gag aag ttc atc acg cac agc 1056
Leu Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

gtc ccg ttc gcg gag atc aac aag gcg ttc aac ctg atg gcc aag ggg 1104
Val Pro Phe Ala Glu Ile Asn Lys Ala Phe Asn Leu Met Ala Lys Gly
355 360 365

gag ggc atc cgc tgc atc atc cgc atg gag aac tag 1140
Glu Gly Ile Arg Cys Ile Ile Arg Met Glu Asn
370 375 

<210> 48 
<211> 379 
<212> PRT 
<213> Zea mays 

<400> 48 

Met Ala Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Ala Trp
1 5 10 15

Glu Ala Gly Lys Pro Leu Ser Ile Glu Glu Val Glu Val Ala Pro Pro
20 25 30

Gln Ala Met Glu Val Arg Val Lys Ile Leu Phe Thr Ser Leu Cys His
35 40 45

Thr Asp Val Tyr Phe Trp Glu Ala Lys Gly Gln Thr Pro Val Phe Pro
50 55 60

Arg Ile Phe Gly His Glu Ala Gly Gly Ile Ile Glu Ser Val Gly Glu
65 70 75 80

Gly Val Thr Asp Val Ala Pro Gly Asp His Val Leu Pro Val Phe Thr
85 90 95

Gly Glu Cys Lys Glu Cys Ala His Cys Lys Ser Ala Glu Ser Asn Met
100 105 110

Cys Asp Leu Leu Arg Ile Asn Thr Asp Arg Gly Val Met Ile Gly Asp
115 120 125

Gly Lys Ser Arg Phe Ser Ile Asn Gly Lys Pro Ile Tyr His Phe Val
130 135 140

Gly Thr Ser Thr Phe Ser Glu Tyr Thr Val Met His Val Gly Cys Val
145 150 155 160

Ala Lys Ile Asn Pro Gln Ala Pro Leu Asp Lys Val Cys Val Leu Ser
165 170 175

Cys Gly Ile Ser Thr Gly Leu Gly Ala Ser Ile Asn Val Ala Lys Pro
180 185 190

Pro Lys Gly Ser Thr Val Ala Val Phe Gly Leu Gly Ala Val Gly Leu
195 200 205

Ala Ala Ala Glu Gly Ala Arg Ile Ala Gly Ala Ser Arg Ile Ile Gly
210 215 220

Val Asp Leu Asn Pro Ser Arg Phe Glu Glu Ala Arg Lys Phe Gly Cys
225 230 235 240

Thr Glu Phe Val Asn Pro Lys Asp His Asn Lys Pro Val Gln Glu Val
245 250 255

Leu Ala Glu Met Thr Asn Gly Gly Val Asp Arg Ser Val Glu Cys Thr
260 265 270

Gly Asn Ile Asn Ala Met Ile Gln Ala Phe Glu Cys Val His Asp Gly
275 280 285

Trp Gly Val Ala Val Leu Val Gly Val Pro His Lys Asp Ala Glu Phe
290 295 300

Lys Thr His Pro Met Asn Phe Leu Asn Glu Arg Thr Leu Lys Gly Thr
305 310 315 320

Phe Phe Gly Asn Tyr Lys Pro Arg Thr Asp Leu Pro Asn Val Val Glu
325 330 335

Leu Tyr Met Lys Lys Glu Leu Glu Val Glu Lys Phe Ile Thr His Ser
340 345 350

Val Pro Phe Ala Glu Ile Asn Lys Ala Phe Asn Leu Met Ala Lys Gly
355 360 365

Glu Gly Ile Arg Cys Ile Ile Arg Met Glu Asn
370 375 



<210> 49 
<211> 505 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: 
coding for sense RNA-fragment of E.coli codA gene

<400> 49

aagcttggct aacagtgtcg aataacgctt tacaaacaat tattaacgcc cggttaccag 60
gcgaagaggg gctgtggcag attcatctgc aggacggaaa aatcagcgcc attgatgcgc 120
aatccggcgt gatgcccata actgaaaaca gcctggatgc cgaacaaggt ttagttatac 180
cgccgtttgt ggagccacat attcacctgg acaccacgca aaccgccgga caaccgaact 240
ggaatcagtc cggcacgctg tttgaaggca ttgaacgctg ggccgagcgc aaagcgttat 300
taacccatga cgatgtgaaa caacgcgcat ggcaaacgct gaaatggcag attgccaacg 360
gcattcagca tgtgcgtacc catgtcgatg tttcggatgc aacgctaact gcgctgaaag 420
caatgctgga agtgaagcag gaagtcgcgc cgtggattga tctgcaaatc gtcgccttcc 480
ctcaggaagg gattttgtcg tcgac 505



<210> 50 
<211> 27 
<212> DNA

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 50 

cgtgaatacg gcgtggagtc g 21

<210> 51
<211> 26
<212> DNA

<213> Artificial sequence
<220>

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 51 

cggcaggata atcaggttgg 20 

<210> 52 
<211> 505 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: coding for 
antisense RNA-fragment of E.coli codA gene

<400> 52 

gaattcggct aacagtgtcg aataacgctt tacaaacaat tattaacgcc cggttaccag 60 
gcgaagaggg gctgtggcag attcatctgc aggacggaaa aatcagcgcc attgatgcgc 120 
aatccggcgt gatgcccata actgaaaaca gcctggatgc cgaacaaggt ttagttatac 180 
cgccgtttgt ggagccacat attcacctgg acaccacgca aaccgccgga caaccgaact 240 
ggaatcagtc cggcacgctg tttgaaggca ttgaacgctg ggccgagcgc aaagcgttat 300 
taacccatga cgatgtgaaa caacgcgcat ggcaaacgct gaaatggcag attgccaacg 360 
gcattcagca tgtgcgtacc catgtcgatg tttcggatgc aacgctaact gcgctgaaag 420 
caatgctgga agtgaagcag gaagtcgcgc cgtggattga tctgcaaatc gtcgccttcc 480
ctcaggaagg gattttgtcg gatcc 505 

<210> 53 
<211> 27 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 53 

gtcaacgtaa ccaaccctgc 20 

<210> 54 
<211> 26 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 54

ggatccgaca aaatcccttc ctgagg 26 

<210> 55 
<211> 5674 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: vector 
construct pBluKS-nitP-STLS1-35S-T

<400> 55 

ccagcttttg ttccctttag tgagggttaa tttcgagctt ggcgtaatca tggtcatagc 60 
tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca 120
taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct 180
cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac 240
gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc 300
tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt 360
tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg 420
ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg 480
agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat 540
accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta 600
ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct 660
gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 720
ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa 780
gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg 840
taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag 900
tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt 960
gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 1020
cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc 1080
agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca 1140
cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa 1200
cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat 1260
ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct 1320
taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt 1380
tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat 1440
ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta 1500
atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg 1560
gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt 1620
tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg 1680
cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg 1740 
taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc 1800
ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa 1860
ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac 1920
cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt 1980
ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg 2040
gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa 2100
gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 2160
aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgcg ccctgtagcg 2220
gcgcattaag cgcggcgggt gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg 2280
ccctagcgcc cgctcctttc gctttcttcc cttcctttct cgccacgttc gccggctttc 2340
cccgtcaagc tctaaatcgg gggctccctt tagggttccg atttagtgct ttacggcacc 2400
tcgaccccaa aaaacttgat tagggtgatg gttcacgtag tgggccatcg ccctgataga 2460
cggtttttcg ccctttgacg ttggagtcca cgttctttaa tagtggactc ttgttccaaa 2520
ctggaacaac actcaaccct atctcggtct attcttttga tttataaggg attttgccga 2580
tttcggccta ttggttaaaa aatgagctga tttaacaaaa atttaacgcg aattttaaca 2640
aaatattaac gcttacaatt tccattcgcc attcaggctg cgcaactgtt gggaagggcg 2700
atcggtgcgg gcctcttcgc tattacgcca gctggcgaaa gggggatgtg ctgcaaggcg 2760
attaagttgg gtaacgccag ggttttccca gtcacgacgt tgtaaaacga cggccagtga 2820
attgtaatac gactcactat agggcgaatt ggagctcgtc gagaccagat gttttacact 2880
tgaccgtaaa tgagcacccg aagaaaccgg tcacattcat ttcgaaggtg gagaaagcgg 2940 
aagatgactc aaacaagtaa tcggttgtga ttcgtcagtt catgtcactc ctatgaagga 3000
gtcaagttca aaatgttatg ttgagtttca aacttttatg ctaaactttt tttctttatt 3060
ttcgttaata atggaagaga accaattctc ttgtatctaa agattatcca tctatcatcc 3120
aatttgagtg ttcaattctg gatgttgtgt taccctacat tctacaacca tgtagccaat 3180
tattatgaat ctggctttga tttcagttgt gttcttttct tttttttctt tgcatatttg 3240
catttagaat gtttaataat taagttactg tatttccaca tacattagtt ccaagaatat 3300
acatatatta atttattttt cttaaaaatg ttttggaatg actaatattg acaacgaaaa 3360
tagaagctat gctaaaccat tacgtatatg tgacttcaca tgttgttgtt ttacattccc 3420
tatatatatg gatggctgtc acaatcagaa acgtgatcga aaaaagacaa acagtgtttg 3480
cataaaaaga ctatttcgtt tcattgacaa tttgtgttta tttgtaaaga aaagtggcaa 3540

agtggaattt gagttcctgc aagtaagaaa gatgaaataa aagacttgag tgtgtgtttt 3600 

tttcttttat ctgaaagctg caatgaaata ttcctaccaa gcccgtttga ttattaattg 3660 

gggtttggtt ttcttgatgc gaactaattg gttatataag aaactataca atccatgtta 3720 

attcaaaaat tttgatttct cttgtaggaa tatgatttac tatatgagac tttcttttcg 3780 

ccaataatag taaatccaaa gatatttgac cggaccaaaa cacattgatc tattttttag 3840 

tttatttaat ccagtttctc tgagataatt cattaaggaa aacttagtat taacccatcc 3900

taagattaaa taggagccaa actcacattt caaatattaa ataacataaa atggatttaa 3960

aaaatctata cgtcaaattt tatttatgac atttcttatt taaatttata tttaatgaaa 4020

tacagctaag acaaaccaaa aaaaaaatac tttctaagtg gtccaaaaca tcaattccgt 4080

tcaatattat taggtagaat cgtacgacca aaaaaaggta ggttaatacg aattagaaac 4140

atatctataa catagtatat attattacct attatgagga atcaaaatgc atcaaatatg 4200

gatttaagga atccataaaa gaataaattc tacgggaaaa aaaatggaat aaattctttt 4260

aagtttttta tttgtttttt atttggtagt tctccatttt gttttatttc gtttggattt 4320

attgtgtcca aatactttgt aaaccaccgt tgtaattctt aaacggggtt ttcacttctt 4380

ttttatattc agacataaag catcggctgg tttaatcaat caatagattt tatttttctt 4440

ctcaattatt agtaggtttg atgtgaactt tacaaaaaaa acaaaaacaa atcaatgcag 4500

agaaaagaaa ccacgtgggc tagtcccacc ttgtttcatt tccaccacag gttcgatctt 4560

cgttaccgtc tccaatagga aaataaacgt gaccacaaaa aaaaaacaaa aaaaagtcta 4620

tatattgctt ctctcaagtc tctgagtgtc atgaaccaaa gtaaaaaaca aagactcgac 4680

ctgcaggcat gcaagcttat cgtcgactac gtaagtttct gcttctacct ttgatatata 4740

tataataatt atcattaatt agtagtaata taatatttca aatatttttt tcaaaataaa 4800

agaatgtagt atatagcaat tgcttttctg tagtttataa gtgtgtatat tttaatttat 4860

aacttttcta atatatgacc aaaatttgtt gatgtgcagg tatcaccgga tccatcgaat 4920

tcggtacgct gaaatcacca gtctctctct acaaatctat ctctctctat tttctccata 4980

aataatgtgt gagtagtttc ccgataaggg gaanttaggg ttcttatagg gtttcgctca 5040

tgtgttgagc atataagaaa cccttagtat gtatttgtat ttgtaaaata cttctatcaa 5100

taaaatttct aattcctaaa accaaaatcc agtactaaaa tccagatctc ctaaagtccc 5160

tatagatctt tgtcgtgaat ataaaccaga cacgagacga ctaaacctgg agcccagacg 5220

ccgttcgaag ctagaagtac cgcttaggca ggaggccgtt agggaaaaga tgctaaggca 5280

gggttggtta cgttgactcc cccgtaggtt tggtttaaat atgatgaagt ggacggaagg 5340

aaggaggaag acaaggaagg ataaggttgc aggccctgtg caaggtaaga agatggaaat 5400

ttgatagagg tacgctacta tacttatact atacgctaag ggaatgcttg tatttatacc 5460

ctataccccc taataacccc ttatcaattt aagaaataat ccgcataagc ccccgcttaa 5520

aaattggtat cagagccatg aataggtcta tgaccaaaac tcaagaggat aaaacctcac 5580

caaaatacga aagagttctt aactctaaag ataaaagatc tttcaagatc aaaactagtt 5640

ccctcacacc ggtgacgggg atcgcgatgg gtac 5674

<210> 56 
<211> 6046 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: binary 
vector pSUN1

<400> 56 

ttccatggac atacaaatgg acgaacggat aaaccttttc acgccctttt aaatatccga 60 

ttattctaat aaacgctctt ttctcttagg tttacccgcc aatatatcct gtcaaacact 120 

gatagtttaa actgaaggcg ggaaacgaca atcagatcta gtaggaaaca gctatgacca 180 

tgattacgcc aagcttgcat gcctgcaggt cgactctaga ctagtggatc cgatatcgcc 240 

cgggctcgag gtaccgagct cgaattcact ggccgtcgtt ttacaacgac tcagctgctt 300 

ggtaataatt gtcattagat tgtttttatg catagatgca ctcgaaatca gccaatttta 360 

gacaagtatc aaacggatgt taattcagta cattaaagac gtccgcaatg tgttattaag 420 

ttgtctaagc gtcaatttgt ttacaccaca atatatcctg ccaccagcca gccaacagct 480 

ccccgaccgg cagctcggca caaaatcacc acgcgttacc accacgccgg ccggccgcat 540 

ggtgttgacc gtgttcgccg gcattgccga gttcgagcgt tccctaatca tcgaccgcac 600 
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ccggagcggg cgcgaggccg ccaaggcccg aggcgt.gaag tttggccccc gccctaccct 660 
caccccggca cagatcgcgc acgcccgcga gctgatcgac caggaaggcc gcaccg-tgaa 720 
agaggcggct gcactgcttg gcgt.gcatcg ctcgaccctg taccgcgcac ttgagcgcag 780 
cgaggaag'bg acgcccaccg aggccaggcg gcgcggt.gcc ttccgtgagg acgcattgac 84 0 
cgaggccgac gccc-tggcgg ccgccgagaa -bgaacgccaa gaggaacaag caligaaaccg 900 
caccaggacg gccaggacga accg-tt-btiiic a-b-baccgaag agatrcgaggc ggagatgat:c 960 
gcggccgggt acgtgttcga gccgcccgcg cacgtctcaa ccgtgcggct gcatgaaatc 10 2 0 
ctggccggtt tgtctgatgc caagctggcg gcctggccgg ccagctiliggc cgctgaagaa 1080 
accgagcgcc gccgtctaaa aaggtgatgt gtatttgagt aaaacagctl: gcg-tcatigcg 1140 
g'tcgc'tgcg't a-tatga'bgcg a-bgag-baaa-b aaacaaatac gcaaggggaa cgca-hgaagg 1200 
ttatcgctgt acbtaaccag aaaggcgggt caggcaagac gaccabcgca acccatctag 1260 
cccgcgccct gcaactcgcc ggggccgatg ttctgttagt cga-b-bccgat: ccccagggca 1320 
gtgcccgcga ttgggcggcc gtgcgggaag atcaaccgct aaccgttgtc ggcat-cgacc 1380 
gcccgacgat tgaccgcgac gtgaaggcca tcggccggcg cgacttcgta gtgat-cgacg 1440 
gagcgcccca ggcggcggac ttggctgtgt ccgcgatcaa ggcagccgac tt.cg±.gct.ga 1500 
ttccggtgca gccaagccct tacgacatat gggccaccgc cgacctgg^g gagctggtta 1560 
agcagcgcat. tgagg-bcacg ga-tggaaggc tiacaagcggc cti'kligticg'bg "tcgcgggcga 1620 
-tcaaaggcac gcgcaiicggc ggtgaggttg ccgaggcgct ggccgggtac gagctgccca 1680 
ttcttgag-tc ccg1:atcacg cagcgcgtga gctacccagg cac1;gccgcc gccggcacaa 1740 
ccgttcttga atcagaaccc gagggcgacg ctgcccgcga ggticcaggcg ctggcegctg 1800 
aaat:taaa'tc aaaactcabt tgagt^taatg aggtaaagag aaaa-tgagca aaagcacaaa 1860 
cacgctaag-t gccggccgtc cgagcgcacg cagcagcaag gcbgcaacgt tggccagcct 1920 
ggcagacacg ccagccatga agcgggtcaa cfttcagttg ccggcggagg at^cacaccaa 1980 
gcbgaagabg tacgcggtac gccaaggcaa gaccattacc gagctgctat ct.gaat.aca^ 2040 
cgcgcagcta ccagagtaaa tgagcaaabg aat.aaatgag -kaga-tgaa-bt. ttagcggcta 2100 
aaggaggcgg catggaaaat caagaacaac caggcaccga cgccgtggaa tgccccatgt 2160 
gbggaggaac gggcggttgg ccaggcgtaa gcggcbgggt tgtctgccgg ccctgcaatg 22 20 
gcactiggaac ccccaagccc gaggaa-bcgg cgt.gagcggt cgcaaaccat. ccggcccggt 2280 
acaaa-bcggc gcggcgctgg gtgatgacct ggtggagaag t:tgaaggccg cgcaggccgc 2340 
ccagcggcaa cgca-bcgagg cagaagcacg ccccgg-bgaa -bcgbggcaag cg-gccgcbga 2400 
-bcgaat.ccgc aaagaa-bccc ggcaaccgcc ggcagccgg-b gcgccgbcga t.t.aggaagcc 2460 
gcccaagggc gacgagcaac cagatttttt cgttccgatg cbc-batigacg t:gggcacccg 2520 
cgatagtcgc agcatcatgg acgtggccgt tttccgtctg tcgaagcgt:g accgacgagc 2580 
tggcgaggtg atccgctacg agcbtccaga cgggcacgta gaggtttccg cagggccggc 2640 
cggcatggcc agtgtgtggg attacgacct ggtactgatg gcggtttccc aticbaaccga 2700 
atccatgaac cgataccggg aagggaaggg agacaagccc ggccgcg-tgt tccgtccaca 27 60 
cgttgcggac gtactcaagt tctgccggcg agccgatggc ggaaagcaga aagacgacct 2820 
ggtagaaacc tgcattcggt baaacaccac gcacgttgcc atgcagcgta cgaagaaggc 2880 
caagaacggc cgcct.ggt.ga cggtatccga gggtgaagcc ttgattagcc gctacaagat 294 0 
cgtaaagagc gaaaccgggc ggccggagta catcgagatc gagctagctg attggatgta 3 000 
ccgcgagatc acagaaggca agaacccgga cgtgctgacg gttcaccccg attacttttt 3060 
gatcgatccc ggca-bcggcc gtt-ttctcta ccgcctggca cgccgcgccg caggcaaggc 3120 
agaagccaga tggt,^g-ttca agacgatcta cgaacgcagt ggcagcgccg gagagt^bcaa 3180 
gaagttctgt ttcaccgtgc gcaagctgat cgggtcaaat gacctgccgg agtacgattt 3240 
gaaggaggag gcggggcagg ctggcccga'b cctagtcatg cgctaccgca acctgatcga 3300 
gggcgaagca tccgccggtt cctaatgtac ggagcagatg ctagggcaaa ttgccctagc 3360 
aggggaaaaa ggtcgaaaag gtctctttcc tgtggatagc acgtacattg ggaacccaaa 3420 
gccgtacatt gggaaccgga acccgtacat tgggaaccca aagccgtaca ttgggaaccg 3480
gtcacacatg taagtgactg atataaaaga gaaaaaaggc gatttttccg cctaaaactc 3540
tttaaaactt attaaaactc ttaaaacccg cctggcctgt gcataactgt ctggccagcg 3600
cacagccgaa gagctgcaaa aagcgcctac ccttcggtcg ctgcgctccc tacgccccgc 3660
cgcttcgcgt cggcctatcg cggccgctgg ccgctcaaaa atggctggcc tacggccagg 3720
caatctacca gggcgcggac aagccgcgcc gtcgccactc gaccgccggc gcccacatca 3780
aggcaccctg cctcgcgcgt ttcggtgatg acggtgaaaa cctctgacac atgcagctcc 3840
cggagacggt cacagcttgt ctgtaagcgg atgccgggag cagacaagcc cgtcagggcg 3900
cgtcagcggg tgttggcggg tgtcggggcg cagccatgac ccagtcacgt agcgatagcg 3960
gagtgtatac tggcttaact atgcggcatc agagcagatt gtactgagag tgcaccatat 4020
gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcaggc gctcttccgc 4080
ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 4140
ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 4200
agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 4260
taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 4320
cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 4380
tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 4440
gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 4500
gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 4560
tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 4620
gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 4680
cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 4740
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 4800
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 4860
ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgca 4920
tgatatatct cccaatttgt gtagggctta ttatgcacgc ttaaaaataa taaaagcaga 4980
cttgacctga tagtttggct gtgagcaatt atgtgcttag tgcatctaac gcttgagtta 5040
agccgcgccg cgaagcggcg tcggcttgaa cgaatttcta gctagacatt atttgccgac 5100
taccttggtg atctcgcctt tcacgtagtg gacaaattct tccaactgat ctgcgcgcga 5160
ggccaagcga tcttcttctt gtccaagata agcctgtcta gcttcaagta tgacgggctg 5220
atactgggcc ggcaggcgct ccattgccca gtcggcagcg acatccttcg gcgcgatttt 5280
gccggttact gcgctgtacc aaatgcggga caacgtaagc actacatttc gctcatcgcc 5340
agcccagtcg ggcggcgagt tccatagcgt taaggtttca tttagcgcct caaatagatc 5400
ctgttcagga accggatcaa agagttcctc cgccgctgga cctaccaagg caacgctatg 5460
ttctcttgct tttgtcagca agatagccag atcaatgtcg atcgtggctg gctcgaagat 5520
acctgcaaga atgtcattgc gctgccattc tccaaattgc agttcgcgct tagctggata 5580
acgccacgga atgatgtcgt cgtgcacaac aatggtgact tctacagcgc ggagaatctc 5640
gctctctcca ggggaagccg aagtttccaa aaggtcgttg atcaaagctc gccgcgttgt 5700
ttcatcaagc cttacggtca ccgtaaccag caaatcaata tcactgtgtg gcttcaggcc 5760
gccatccact gcggagccgt acaaatgtac ggccagcaac gtcggttcga gatggcgctc 5820
gatgacgcca actacctctg atagttgagt cgatacttcg gcgatcaccg cttcccccat 5880
gatgtttaac tttgttttag ggcgactgcc ctgctgcgta acatcgttgc tgctccataa 5940
catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg gatgcccgag gcatagactg 6000
taccccaaaa aaacagtcat aacaagccat gaaaaccgcc actgcg 6046

<210> 57 
<211> 9838 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: Transgenic
expression vector for codA dsRNA pSUN1-codA-RNAi

<400> 57 

cgaattcact ggccgtcgtt ttacaacgac tcagctgctt ggtaataatt gtcattagat 60 
tgtttttatg catagatgca ctcgaaatca gccaatttta gacaagtatc aaacggatgt 120 
taattcagta cattaaagac gtccgcaatg tgttattaag ttgtctaagc gtcaatttgt 180 
ttacaccaca atatatcctg ccaccagcca gccaacagct ccccgaccgg cagctcggca 240
caaaatcacc acgcgttacc accacgccgg ccggccgcat ggtgttgacc gtgttcgccg 300 
gcattgccga gttcgagcgt tccctaatca tcgaccgcac ccggagcggg cgcgaggccg 360 
ccaaggcccg aggcgtgaag tttggccccc gccctaccct caccccggca cagatcgcgc 420 
acgcccgcga gctgatcgac caggaaggcc gcaccgtgaa agaggcggct gcactgcttg 480
gcgtgcatcg ctcgaccctg taccgcgcac ttgagcgcag cgaggaagtg acgcccaccg 540 
aggccaggcg gcgcggtgcc ttccgtgagg acgcattgac cgaggccgac gccctggcgg 600 
ccgccgagaa tgaacgccaa gaggaacaag catgaaaccg caccaggacg gccaggacga 660 
accgtttttc attaccgaag agatcgaggc ggagatgatc gcggccgggt acgtgttcga 720
gccgcccgcg cacgtctcaa ccgtgcggct gcatgaaatc ctggccggtt tgtctgatgc 780
caagctggcg gcctggccgg ccagcttggc cgctgaagaa accgagcgcc gccgtctaaa 840
aaggtgatgt gtatttgagt aaaacagctt gcgtcatgcg gtcgctgcgt atatgatgcg 900
atgagtaaat aaacaaatac gcaaggggaa cgcatgaagg ttatcgctgt acttaaccag 960
aaaggcgggt caggcaagac gaccatcgca acccatctag cccgcgccct gcaactcgcc 1020
ggggccgatg ttctgttagt cgattccgat ccccagggca gtgcccgcga ttgggcggcc 1080
gtgcgggaag atcaaccgct aaccgttgtc ggcatcgacc gcccgacgat tgaccgcgac 1140
gtgaaggcca tcggccggcg cgacttcgta gtgatcgacg gagcgcccca ggcggcggac 1200
ttggctgtgt ccgcgatcaa ggcagccgac ttcgtgctga ttccggtgca gccaagccct 1260
tacgacatat gggccaccgc cgacctggtg gagctggtta agcagcgcat tgaggtcacg 1320
gatggaaggc tacaagcggc ctttgtcgtg tcgcgggcga tcaaaggcac gcgcatcggc 1380
ggtgaggttg ccgaggcgct ggccgggtac gagctgccca ttcttgagtc ccgtatcacg 1440
cagcgcgtga gctacccagg cactgccgcc gccggcacaa ccgttcttga atcagaaccc 1500
gagggcgacg ctgcccgcga ggtccaggcg ctggccgctg aaattaaatc aaaactcatt 1560
tgagttaatg aggtaaagag aaaatgagca aaagcacaaa cacgctaagt gccggccgtc 1620
cgagcgcacg cagcagcaag gctgcaacgt tggccagcct ggcagacacg ccagccatga 1680
agcgggtcaa ctttcagttg ccggcggagg atcacaccaa gctgaagatg tacgcggtac 1740
gccaaggcaa gaccattacc gagctgctat ctgaatacat cgcgcagcta ccagagtaaa 1800
tgagcaaatg aataaatgag tagatgaatt ttagcggcta aaggaggcgg catggaaaat 1860
caagaacaac caggcaccga cgccgtggaa tgccccatgt gtggaggaac gggcggttgg 1920
ccaggcgtaa gcggctgggt tgtctgccgg ccctgcaatg gcactggaac ccccaagccc 1980
gaggaatcgg cgtgagcggt cgcaaaccat ccggcccggt acaaatcggc gcggcgctgg 2040
gtgatgacct ggtggagaag ttgaaggccg cgcaggccgc ccagcggcaa cgcatcgagg 2100
cagaagcacg ccccggtgaa tcgtggcaag cggccgctga tcgaatccgc aaagaatccc 2160
ggcaaccgcc ggcagccggt gcgccgtcga ttaggaagcc gcccaagggc gacgagcaac 2220
cagatttttt cgttccgatg ctctatgacg tgggcacccg cgatagtcgc agcatcatgg 2280
acgtggccgt tttccgtctg tcgaagcgtg accgacgagc tggcgaggtg atccgctacg 2340
agcttccaga cgggcacgta gaggtttccg cagggccggc cggcatggcc agtgtgtggg 2400
attacgacct ggtactgatg gcggtttccc atctaaccga atccatgaac cgataccggg 2460
aagggaaggg agacaagccc ggccgcgtgt tccgtccaca cgttgcggac gtactcaagt 2520
tctgccggcg agccgatggc ggaaagcaga aagacgacct ggtagaaacc tgcattcggt 2580
taaacaccac gcacgttgcc atgcagcgta cgaagaaggc caagaacggc cgcctggtga 2640
cggtatccga gggtgaagcc ttgattagcc gctacaagat cgtaaagagc gaaaccgggc 2700
ggccggagta catcgagatc gagctagctg attggatgta ccgcgagatc acagaaggca 2760
agaacccgga cgtgctgacg gttcaccccg attacttttt gatcgatccc ggcatcggcc 2820
gttttctcta ccgcctggca cgccgcgccg caggcaaggc agaagccaga tggttgttca 2880
agacgatcta cgaacgcagt ggcagcgccg gagagttcaa gaagttctgt ttcaccgtgc 2940
gcaagctgat cgggtcaaat gacctgccgg agtacgattt gaaggaggag gcggggcagg 3000
ctggcccgat cctagtcatg cgctaccgca acctgatcga gggcgaagca tccgccggtt 3060
cctaatgtac ggagcagatg ctagggcaaa ttgccctagc aggggaaaaa ggtcgaaaag 3120
gtctctttcc tgtggatagc acgtacattg ggaacccaaa gccgtacatt gggaaccgga 3180
acccgtacat tgggaaccca aagccgtaca ttgggaaccg gtcacacatg taagtgactg 3240
atataaaaga gaaaaaaggc gatttttccg cctaaaactc tttaaaactt attaaaactc 3300
ttaaaacccg cctggcctgt gcataactgt ctggccagcg cacagccgaa gagctgcaaa 3360
aagcgcctac ccttcggtcg ctgcgctccc tacgccccgc cgcttcgcgt cggcctatcg 3420
cggccgctgg ccgctcaaaa atggctggcc tacggccagg caatctacca gggcgcggac 3480
aagccgcgcc gtcgccactc gaccgccggc gcccacatca aggcaccctg cctcgcgcgt 3540
ttcggtgatg acggtgaaaa cctctgacac atgcagctcc cggagacggt cacagcttgt 3600
ctgtaagcgg atgccgggag cagacaagcc cgtcagggcg cgtcagcggg tgttggcggg 3660
tgtcggggcg cagccatgac ccagtcacgt agcgatagcg gagtgtatac tggcttaact 3720
atgcggcatc agagcagatt gtactgagag tgcaccatat gcggtgtgaa ataccgcaca 3780
gatgcgtaag gagaaaatac cgcatcaggc gctcttccgc ttcctcgctc actgactcgc 3840
tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt 3900
tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg 3960
ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg 4020
agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat 4080
accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta 4140
ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct 4200
gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 4260
ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa 4320
gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg 4380
taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag 4440
tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt 4500
gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 4560
cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc 4620
agtggaacga aaactcacgt taagggattt tggtcatgca tgatatatct cccaatttgt 4680
gtagggctta ttatgcacgc ttaaaaataa taaaagcaga cttgacctga tagtttggct 4740
gtgagcaatt atgtgcttag tgcatctaac gcttgagtta agccgcgccg cgaagcggcg 4800
tcggcttgaa cgaatttcta gctagacatt atttgccgac taccttggtg atctcgcctt 4860
tcacgtagtg gacaaattct tccaactgat ctgcgcgcga ggccaagcga tcttcttctt 4920
gtccaagata agcctgtcta gcttcaagta tgacgggctg atactgggcc ggcaggcgct 4980
ccattgccca gtcggcagcg acatccttcg gcgcgatttt gccggttact gcgctgtacc 5040
aaatgcggga caacgtaagc actacatttc gctcatcgcc agcccagtcg ggcggcgagt 5100
tccatagcgt taaggtttca tttagcgcct caaatagatc ctgttcagga accggatcaa 5160
agagttcctc cgccgctgga cctaccaagg caacgctatg ttctcttgct tttgtcagca 5220
agatagccag atcaatgtcg atcgtggctg gctcgaagat acctgcaaga atgtcattgc 5280
gctgccattc tccaaattgc agttcgcgct tagctggata acgccacgga atgatgtcgt 5340
cgtgcacaac aatggtgact tctacagcgc ggagaatctc gctctctcca ggggaagccg 5400
aagtttccaa aaggtcgttg atcaaagctc gccgcgttgt ttcatcaagc cttacggtca 5460
ccgtaaccag caaatcaata tcactgtgtg gcttcaggcc gccatccact gcggagccgt 5520
acaaatgtac ggccagcaac gtcggttcga gatggcgctc gatgacgcca actacctctg 5580
atagttgagt cgatacttcg gcgatcaccg cttcccccat gatgtttaac tttgttttag 5640
ggcgactgcc ctgctgcgta acatcgttgc tgctccataa catcaaacat cgacccacgg 5700
cgtaacgcgc ttgctgcttg gatgcccgag gcatagactg taccccaaaa aaacagtcat 5760
aacaagccat gaaaaccgcc actgcgttcc atggacatac aaatggacga acggataaac 5820
cttttcacgc ccttttaaat atccgattat tctaataaac gctcttttct cttaggttta 5880
cccgccaata tatcctgtca aacactgata gtttaaactg aaggcgggaa acgacaatca 5940
gatctagtag gaaacagcta tgaccatgat tacgccaagc ttgcatgcct gcaggtcgac 6000
tctagactag tggatccgat atcgcccggg ctcgaggtac ccatcgcgat ccccgtcacc 6060
ggtgtgaggg aactagtttt gatcttgaaa gatcttttat ctttagagtt aagaactctt 6120
tcgtattttg gtgaggtttt atcctcttga gttttggtca tagacctatt catggctctg 6180
ataccaattt ttaagcgggg gcttatgcgg attatttctt aaattgataa ggggttatta 6240
gggggtatag ggtataaata caagcattcc cttagcgtat agtataagta tagtagcgta 6300
cctctatcaa atttccatct tcttaccttg cacagggcct gcaaccttat ccttccttgt 6360
cttcctcctt ccttccgtcc acttcatcat atttaaacca aacctacggg ggagtcaacg 6420
taaccaaccc tgccttagca tcttttccct aacggcctcc tgcctaagcg gtacttctag 6480
cttcgaacgg cgtctgggct ccaggtttag tcgtctcgtg tctggtttat attcacgaca 6540
aagatctata gggactttag gagatctgga ttttagtact ggattttggt tttaggaatt 6600
agaaatttta ttgatagaag tattttacaa atacaaatac atactaaggg tttcttatat 6660
gctcaacaca tgagcgaaac cctataagaa ccctaanttc cccttatcgg gaaactactc 6720
acacattatt tatggagaaa atagagagag atagatttgt agagagagac tggtgatttc 6780
agcgtaccga attcggctaa cagtgtcgaa taacgcttta caaacaatta ttaacgcccg 6840
gttaccaggc gaagaggggc tgtggcagat tcatctgcag gacggaaaaa tcagcgccat 6900
tgatgcgcaa tccggcgtga tgcccataac tgaaaacagc ctggatgccg aacaaggttt 6960
agttataccg ccgtttgtgg agccacatat tcacctggac accacgcaaa ccgccggaca 7020
accgaactgg aatcagtccg gcacgctgtt tgaaggcatt gaacgctggg ccgagcgcaa 7080
agcgttatta acccatgacg atgtgaaaca acgcgcatgg caaacgctga aatggcagat 7140
tgccaacggc attcagcatg tgcgtaccca tgtcgatgtt tcggatgcaa cgctaactgc 7200
gctgaaagca atgctggaag tgaagcagga agtcgcgccg tggattgatc tgcaaatcgt 7260
cgccttccct caggaaggga ttttgtcgga tccggtgata cctgcacatc aacaaatttt 7320
ggtcatatat tagaaaagtt ataaattaaa atatacacac ttataaacta cagaaaagca 7380
attgctatat actacattct tttattttga aaaaaatatt tgaaatatta tattactact 7440
aattaatgat aattattata tatatatcaa aggtagaagc agaaacttac gtagtcgacg 7500
acaaaatccc ttcctgaggg aaggcgacga tttgcagatc aatccacggc gcgacttcct 7560
gcttcacttc cagcattgct ttcagcgcag ttagcgttgc atccgaaaca tcgacatggg 7620
tacgcacatg ctgaatgccg ttggcaatct gccatttcag cgtttgccat gcgcgttgtt 7680
tcacatcgtc atgggttaat aacgctttgc gctcggccca gcgttcaatg ccttcaaaca 7740
gcgtgccgga ctgattccag ttcggttgtc cggcggtttg cgtggtgtcc aggtgaatat 7800
gtggctccac aaacggcggt ataactaaac cttgttcggc atccaggctg ttttcagtta 7860
tgggcatcac gccggattgc gcatcaatgg cgctgatttt tccgtcctgc agatgaatct 7920
gccacagccc ctcttcgcct ggtaaccggg cgttaataat tgtttgtaaa gcgttattcg 7980
acactgttag ccaagcttgc atgcctgcag gtcgagtctt tgttttttac tttggttcat 8040
gacactcaga gacttgagag aagcaatata tagacttttt tttgtttttt ttttgtggtc 8100
acgtttattt tcctattgga gacggtaacg aagatcgaac ctgtggtgga aatgaaacaa 8160
ggtgggacta gcccacgtgg tttcttttct ctgcattgat ttgtttttgt tttttttgta 8220
aagttcacat caaacctact aataattgag aagaaaaata aaatctattg attgattaaa 8280
ccagccgatg ctttatgtct gaatataaaa aagaagtgaa aaccccgttt aagaattaca 8340
acggtggttt acaaagtatt tggacacaat aaatccaaac gaaataaaac aaaatggaga 8400
actaccaaat aaaaaacaaa taaaaaactt aaaagaattt attccatttt ttttcccgta 8460
gaatttattc ttttatggat tccttaaatc catatttgat gcattttgat tcctcataat 8520
aggtaataat atatactatg ttatagatat gtttctaatt cgtattaacc tacctttttt 8580
tggtcgtacg attctaccta ataatattga acggaattga tgttttggac cacttagaaa 8640
gtattttttt tttggtttgt cttagctgta tttcattaaa tataaattta aataagaaat 8700
gtcataaata aaatttgacg tatagatttt ttaaatccat tttatgttat ttaatatttg 8760
aaatgtgagt ttggctccta tttaatctta ggatgggtta atactaagtt ttccttaatg 8820
aattatctca gagaaactgg attaaataaa ctaaaaaata gatcaatgtg ttttggtccg 8880
gtcaaatatc tttggattta ctattattgg cgaaaagaaa gtctcatata gtaaatcata 8940
ttcctacaag agaaatcaaa atttttgaat taacatggat tgtatagttt cttatataac 9000
caattagttc gcatcaagaa aaccaaaccc caattaataa tcaaacgggc ttggtaggaa 9060
tatttcattg cagctttcag ataaaagaaa aaaacacaca ctcaagtctt ttatttcatc 9120
tttcttactt gcaggaactc aaattccact ttgccacttt tctttacaaa taaacacaaa 9180
ttgtcaatga aacgaaatag tctttttatg caaacactgt ttgtcttttt tcgatcacgt 9240
ttctgattgt gacagccatc catatatata gggaatgtaa aacaacaaca tgtgaagtca 9300
catatacgta atggtttagc atagcttcta ttttcgttgt caatattagt cattccaaaa 9360
catttttaag aaaaataaat taatatatgt atattcttgg aactaatgta tgtggaaata 9420
cagtaactta attattaaac attctaaatg caaatatgca aagaaaaaaa agaaaagaac 9480
acaactgaaa tcaaagccag attcataata attggctaca tggttgtaga atgtagggta 9540
acacaacatc cagaattgaa cactcaaatt ggatgataga tggataatct ttagatacaa 9600
gagaattggt tctcttccat tattaacgaa aataaagaaa aaaagtttag cataaaagtt 9660
tgaaactcaa cataacattt tgaacttgac tccttcatag gagtgacatg aactgacgaa 9720
tcacaaccga ttacttgttt gagtcatctt ccgctttctc caccttcgaa atgaatgtga 9780
ccggtttctt cgggtgctca tttacggtca agtgtaaaac atctggtctc gacgagct 9838

<210> 58 
<211> 14184 
<212> DNA 

<213> Artificial sequence
<220> 

<223> Description of the artificial sequence: Expression 
vector pSUN1-codA-RNAi-At.Act.-2-At.Als-R-ocsT

<400> 58 

ctgcttggta ataattgtca ttagattgtt tttatgcata gatgcactcg aaatcagcca 60 
attttagaca agtatcaaac ggatgttaat tcagtacatt aaagacgtcc gcaatgtgtt 120 
attaagttgt ctaagcgtca atttgtttac accacaatat atcctgccac cagccagcca 180 
acagctcccc gaccggcagc tcggcacaaa atcaccacgc gttaccacca cgccggccgg 240 
ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc taatcatcga 300 
ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg gcccccgccc 360 
taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg aaggccgcac 420 
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cgtgaaagag gcggctgcac tgc-ttiggcgt gcatcgctcg accctgtacc gcgcacttga 480 
gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc gt^gaggacgc 54 0 
attgaccgag gccgacgccc tggcggccgc cgagaatgaa cgccaagagg aacaagcatg 600 
aaaccgcacc aggacggcca ggacgaaccg tttttcatta ccgaagagat cgaggcggag 660
atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt gcggctgcat 720
gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct ggccggccag cttggccgct 780
gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa cagcttgcgt 840
catgcggtcg ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa ggggaacgca 900
tgaaggttat cgctgtactt aaccagaaag gcgggtcagg caagacgacc atcgcaaccc 960
atctagcccg cgccctgcaa ctcgccgggg ccgatgttct gttagtcgat tccgatcccc 1020
agggcagtgc ccgcgattgg gcggccgtgc gggaagatca accgctaacc gttgtcggca 1080 
tcgaccgccc gacgattgac cgcgacgtga aggccatcgg ccggcgcgac ttcgtagtga 1140
tcgacggagc gccccaggcg gcggacttgg ctgtgtccgc gatcaaggca gccgacttcg 1200
tgctgattcc ggtgcagcca agcccttacg acatatgggc caccgccgac ctggtggagc 1260
tggttaagca gcgcattgag gtcacggatg gaaggctaca agcggccttt gtcgtgtcgc 1320
gggcgatcaa aggcacgcgc atcggcggtg aggttgccga ggcgctggcc gggtacgagc 1380
tgcccattct tgagtcccgt atcacgcagc gcgtgagcta cccaggcact gccgccgccg 1440
gcacaaccgt tcttgaatca gaacccgagg gcgacgctgc ccgcgaggtc caggcgctgg 1500
ccgctgaaat taaatcaaaa ctcatttgag ttaatgaggt aaagagaaaa tgagcaaaag 1560
cacaaacacg ctaagtgccg gccgtccgag cgcacgcagc agcaaggctg caacgttggc 1620
cagcctggca gacacgccag ccatgaagcg ggtcaacttt cagttgccgg cggaggatca 1680 
caccaagctg aagatgtacg cggtacgcca aggcaagacc attaccgagc tgctatctga 1740
atacatcgcg cagctaccag agtaaatgag caaatgaata aatgagtaga tgaattttag 1800
cggctaaagg aggcggcatg gaaaatcaag aacaaccagg caccgacgcc gtggaatgcc 1860
ccatgtgtgg aggaacgggc ggttggccag gcgtaagcgg ctgggttgtc tgccggccct 1920 
gcaatggcac tggaaccccc aagcccgagg aatcggcgtg agcggtcgca aaccatccgg 1980
cccggtacaa atcggcgcgg cgctgggtga tgacctggtg gagaagttga aggccgcgca 2040
ggccgcccag cggcaacgca tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc 2100
cgctgatcga atccgcaaag aatcccggca accgccggca gccggtgcgc cgtcgattag 2160
gaagccgccc aagggcgacg agcaaccaga ttttttcgtt ccgatgctct atgacgtggg 2220
cacccgcgat agtcgcagca tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg 2280
acgagctggc gaggtgatcc gctacgagct tccagacggg cacgtagagg tttccgcagg 2340 
gccggccggc atggccagtg tgtgggatta cgacctggta ctgatggcgg tttcccatct 2400
aaccgaatcc atgaaccgat accgggaagg gaagggagac aagcccggcc gcgtgttccg 2460
tccacacgtt gcggacgtac tcaagttctg ccggcgagcc gatggcggaa agcagaaaga 2520
cgacctggta gaaacctgca ttcggttaaa caccacgcac gttgccatgc agcgtacgaa 2580
gaaggccaag aacggccgcc tggtgacggt atccgagggt gaagccttga ttagccgcta 2640 
caagatcgta aagagcgaaa ccgggcggcc ggagtacatc gagatcgagc tagctgattg 2700
gatgtaccgc gagatcacag aaggcaagaa cccggacgtg ctgacggttc accccgatta 2760
ctttttgatc gatcccggca tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg 2820
caaggcagaa gccagatggt tgttcaagac gatctacgaa cgcagtggca gcgccggaga 2880
gttcaagaag ttctgtttca ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta 2940
cgatttgaag gaggaggcgg ggcaggctgg cccgatccta gtcatgcgct accgcaacct 3000 
gatcgagggc gaagcatccg ccggttccta atgtacggag cagatgctag ggcaaattgc 3060
cctagcaggg gaaaaaggtc gaaaaggtct ctttcctgtg gatagcacgt acattgggaa 3120
cccaaagccg tacattggga accggaaccc gtacattggg aacccaaagc cgtacattgg 3180
gaaccggtca cacatgtaag tgactgatat aaaagagaaa aaaggcgatt tttccgccta 3240
aaactcttta aaacttatta aaactcttaa aacccgcctg gcctgtgcat aactgtctgg 3300
ccagcgcaca gccgaagagc tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg 3360
ccccgccgct tcgcgtcggc ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg 3420
gccaggcaat ctaccagggc gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc 3480
acatcaaggc accctgcctc gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc 3540
agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga caagcccgtc 3600 
agggcgcgtc agcgggtgtt ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg 3660 
atagcggagt gtatactggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 3720
ccatatgcgg tgtgaaatac cgcacagatg cgtaaggaga aaataccgca tcaggcgctc 3780
ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc 3840
agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa 3900
catgtgagca aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt 3960 
tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg 4020 
gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg 4080 
ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag 4140 
cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc 4200 
caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa 4260
ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg 4320
taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc 4380
taactacggc tacactagaa ggacagtatt tggtatctgc gctctgctga agccagttac 4440
cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 4500
tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 4560
gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 4620
catgcatgat atatctccca atttgtgtag ggcttattat gcacgcttaa aaataataaa 4680 
agcagacttg acctgatagt ttggctgtga gcaattatgt gcttagtgca tctaacgctt 4740 
gagttaagcc gcgccgcgaa gcggcgtcgg cttgaacgaa tttctagcta gacattattt 4800
gccgactacc ttggtgatct cgcctttcac gtagtggaca aattcttcca actgatctgc 4860
gcgcgaggcc aagcgatctt cttcttgtcc aagataagcc tgtctagctt caagtatgac 4920
gggctgatac tgggccggca ggcgctccat tgcccagtcg gcagcgacat ccttcggcgc 4980
gattttgccg gttactgcgc tgtaccaaat gcgggacaac gtaagcacta catttcgctc 5040
atcgccagcc cagtcgggcg gcgagttcca tagcgttaag gtttcattta gcgcctcaaa 5100 
tagatcctgt tcaggaaccg gatcaaagag ttcctccgcc gctggaccta ccaaggcaac 5160 
gctatgttct cttgcttttg tcagcaagat agccagatca atgtcgatcg tggctggctc 5220 
gaagatacct gcaagaatgt cattgcgctg ccattctcca aattgcagtt cgcgcttagc 5280 
tggataacgc cacggaatga tgtcgtcgtg cacaacaatg gtgacttcta cagcgcggag 5340 
aatctcgctc tctccagggg aagccgaagt ttccaaaagg tcgttgatca aagctcgccg 5400
cgttgtttca tcaagcctta cggtcaccgt aaccagcaaa tcaatatcac tgtgtggctt 5460 
caggccgcca tccactgcgg agccgtacaa atgtacggcc agcaacgtcg gttcgagatg 5520
gcgctcgatg acgccaacta cctctgatag ttgagtcgat acttcggcga tcaccgcttc 5580 
ccccatgatg tttaactttg ttttagggcg actgccctgc tgcgtaacat cgttgctgct 5640 
ccataacatc aaacatcgac ccacggcgta acgcgcttgc tgcttggatg cccgaggcat 5700 
agactgtacc ccaaaaaaac agtcataaca agccatgaaa accgccactg cgttccatgg 5760
acatacaaat ggacgaacgg ataaaccttt tcacgccctt ttaaatatcc gattattcta 5820 
ataaacgctc ttttctctta ggtttacccg ccaatatatc ctgtcaaaca ctgatagttt 5880 
aaactgaagg cgggaaacga caatcagatc tagtaggaaa cagctatgac catgattacg 5940
ccaagcttgc atgcctgcag gtcgactcta gactagtgga tccgatatcg cccgggctcg 6000 
aggtacccat cgcgatcccc gtcaccggtg tgagggaact agttttgatc ttgaaagatc 6060 
ttttatcttt agagttaaga actctttcgt attttggtga ggttttatcc tcttgagttt 6120 
tggtcataga cctattcatg gctctgatac caatttttaa gcgggggctt atgcggatta 6180 
tttcttaaat tgataagggg ttattagggg gtatagggta taaatacaag cattccctta 6240 
gcgtatagta taagtatagt agcgtacctc tatcaaattt ccatcttctt accttgcaca 6300 
gggcctgcaa ccttatcctt ccttgtcttc ctccttcctt ccgtccactt catcatattt 6360 
aaaccaaacc tacgggggag tcaacgtaac caaccctgcc ttagcatctt ttccctaacg 6420
gcctcctgcc taagcggtac ttctagcttc gaacggcgtc tgggctccag gtttagtcgt 6480 
ctcgtgtctg gtttatattc acgacaaaga tctataggga ctttaggaga tctggatttt 6540 
agtactggat tttggtttta ggaattagaa attttattga tagaagtatt ttacaaatac 6600 
aaatacatac taagggtttc ttatatgctc aacacatgag cgaaacccta taagaaccct 6660 
aatttccctt atcgggaaac tactcacaca ttatttatgg agaaaataga gagagataga 6720 
tttgtagaga gagactggtg atttcagcgt accgaattcg attttcggct aacagtgtcg 6780 
aataacgctt tacaaacaat tattaacgcc cggttaccag gcgaagaggg gctgtggcag 6840 
attcatctgc aggacggaaa aatcagcgcc attgatgcgc aatccggcgt gatgcccata 6900
actgaaaaca gcctggatgc cgaacaaggt ttagttatac cgccgtttgt ggagccacat 6960 
attcacctgg acaccacgca aaccgccgga caaccgaact ggaatcagtc cggcacgctg 7020 
tttgaaggca ttgaacgctg ggccgagcgc aaagcgttat taacccatga cgatgtgaaa 7080 
caacgcgcat ggcaaacgct gaaatggcag attgccaacg gcattcagca tgtgcgtacc 7140 
catgtcgatg tttcggatgc aacgctaact gcgctgaaag caatgctgga agtgaagcag 7200
gaagtcgcgc cgtggattga tctgcaaatc gtcgccttcc ctcaggaagg gattttgtcg 7260
gatccggtga tacctgcaca tcaacaaatt ttggtcatat attagaaaag ttataaatta 7320
aaatatacac acttataaac tacagaaaag caattgctat atactacatt cttttatttt 7380
gaaaaaaata tttgaaatat tatattacta ctaattaatg ataattatta tatatatatc 7440
aaaggtagaa gcagaaactt acgtagtcga cgacaaaatc ccttcctgag ggaaggcgac 7500
gatttgcaga tcaatccacg gcgcgacttc ctgcttcact tccagcattg ctttcagcgc 7560
agttagcgtt gcatccgaaa catcgacatg ggtacgcaca tgctgaatgc cgttggcaat 7620
ctgccatttc agcgtttgcc atgcgcgttg tttcacatcg tcatgggtta ataacgcttt 7680
gcgctcggcc cagcgttcaa tgccttcaaa cagcgtgccg gactgattcc agttcggttg 7740
tccggcggtt tgcgtggtgt ccaggtgaat atgtggctcc acaaacggcg gtataactaa 7800
accttgttcg gcatccaggc tgttttcagt tatgggcatc acgccggatt gcgcatcaat 7860
ggcgctgatt tttccgtcct gcagatgaat ctgccacagc ccctcttcgc ctggtaaccg 7920
ggcgttaata attgtttgta aagcgttatt cgacactgtt agccaagctt gcatgcctgc 7980
aggtcgactc tagaggatcc ccgatccact cgagtctttg ttttttactt tggttcatga 8040
cactcagaga cttgagagaa gcaatatata gacttttttt tgtttttttt ttgtggtcac 8100
gtttattttc ctattggaga cggtaacgaa gatcgaacct gtggtggaaa tgaaacmagg 8160
tgggactagc ccacgtggtt tcttttctct gcattgattt gtttttgttt tttytgtaaa 8220
gttcacatca aacctactaa taattgagaa gaaaaataaa atctattgat tgattaaacc 8280
agccgatgct ttatiglic-kga atataaaaaa gaagtgaaaa ccccgtttaa gaattacaac 8340
ggtggtttac aaagtatttg gacacaataa atccaaacga aataaaacaa aatggagaac 8400
taccaaataa aaaacaaata aaaaacttaa aagaatttat tccatttttt ttcccgtaga 8460
atttattctt ttatggattc cttaaatcca tatttgatgc attttgattc ctcataatag 8520
gtaataatat atactatgtt atagatatgt ttctaattcg tattaaccta cctttttttg 8580
gtcgtacgat tctacctaat aatattgaac ggaattgatg ttttggacca cttagaaagt 8640
attttttttt tggtttgtct tagctgtatt t:cattaaBta taaatttaaa taagaaatgt 8700
cataaatcaa atttgacgta tagatttttt aaatccattt tatgttattt aatatttgaa 8760
atgtgagttt ggctcctatt taatcttagg atgggttaat actaagtttt ccttaatgaa 8820
ttatctcaga gaaactggat taaataaact aaaaaataga tcaatgtgtt ttggtccggt 8880
caaatatctt tggatttact attattggcg aaaagaaagt ctcatatagt aaatcatatt 8940
cctacaagag aaatcaaaat ttttgaatta acatggattg tatagtttct -tatLaliaacca 9000
attagttcgc atcaagaaaa ccaaacccca attaataatc aaacgggctt ggtaggaata 9060
t-titcafbgca gctttcagat aaaagaaaaa aacacacact caagtctttt atttcatctt 9120
tcttacttgc aggaactcaa attccacttt gccacttttc tttacaaata aacacaaatt 9180
gtcaatgaaa cgaaatagtc tttttatgca aacactgttt gtcttttttc gatcacgttt 9240
ctgattgtga cagccatcca tatatatagg gaatgtaaaa caacaacatg tgaagtcaca 9300
tatacgtaat ggtttagcat agcttctatt ttcgttgtca atattagtca ttccaaaaca 9360
tttttaagaa aaataaatta atatatgtat attcttggaa ctaatgtatg tggaaataca 9420
gtaacttaat tattaaacat tctaaatgca aatatgcaaa gaaaaaaaag aaaagaacac 9480
aactgaaatc aaagccagat tcataataat tggctacatg gttgtagaat gtagggtaac 9540
acaacatcca gaattgaaca ctcaaattgg atgatagatg gataatcttt agatacaaga 9600
gaattggttc tcttccatta ttaacgaaaa taaagaaaaa aagtttagca taaaagtttg 9660
aaactcaaca taacattttg aacttgactc cttcatagga gtgacatgaa ctgacgaatc 9720
acaaccgatt acttgtttga gtcatcttcc gctttctcca ccttcgaaat gaatgtgacc 9780
ggtttcttcg ggtgctcatt tacggtcaag tgtaaaacat ctggtctcga gtaatgtcca 9840
accgaatcga agtacaactt agctcttgct acatcaccaa gatcttgatg ggggatcggg 9900
taccgagctc gaattcactg gccgtcgttt tacaacgact cagcacgcgt tggtttcgac 9960
aaaatttaga acgaacttaa ttatgatctc aaatacattg atacatatct catctagatc 10020
taggttatca ttatgtaaga aagttttgac gaatatggca cgacaaaatg gctagactcg 10080
atgtaattgg tatctcaact caacattata cttataccaa acattagtta gacaaaattt 10140
aaacaactat tttttatgta tgcaagagtc agcatatgta taattgattc agaatcgttt 10200
tgacgagttc ggatgtagta gtagccatta tttaatgtac atactaatcg tgaatagtga 10260
atatgatgaa acattgtatc ttattgtata aatatccata aacacatcat gaaagacact 10320
ttctttcacg gtctgaatta attatgatac aattctaata gaaaacgaat taaattacgt 10380
tgaattgtat gaaatctaat tgaacaagcc aaccacgacg acgactaacg ttgcctggat 10440
tgactcggtt taagttaacc actaaaaaaa cggagctgtc atgtaacacg cggatcgagc 10500
aggtcacagt catgaagcca tcaaagcaaa agaactaatc caagggctga gatgattaat 10560
tagtttaaaa attagttaac acgagggaaa aggctgtctg acagccaggt cacgttatct 10620
ttacctgtgg tcgaaatgat tcgtgtctgt cgattttaat tatttttttg aaaggccgaa 10680
aataaagttg taagagataa acccgcctat ataaattcat atattttcct ctccgctttg 10740
aattgtctcg ttgtcctcct cactttcatc agccgttttg aatctccggc gacttgacag 10800
agaagaacaa ggaagaagac taagagagaa agtaagagat aatccaggag attcattctc 10860
cgttttgaat cttcctcaat ctcatcttct tccgctcttt ctttccaagg taataggaac 10920
tttctggatc tactttattt gctggatctc gatcttgttt tctcaatttc cttgagatct 10980
ggaattcgtt taatttggat ctgtgaacct ccactaaatc ttttggtttt actagaatcg 11040
atctaagttg accgatcagt tagctcgatt atagctacca gaatttggct tgaccttgat 11100 
ggagagatcc atgttcatgt tacctgggaa atgatttgta tatgtgaatt gaaatctgaa 11160
ctgttgaagt tagattgaat ctgaacactg tcaatgttag attgaatctg aacactgttt 11220
aaggttagat gaagtttgtg tatagattct tcgaaacttt aggatttgta gtgtcgtacg 11280
ttgaacagaa agctatttct gattcaatca gggtttattt gactgtattg aactcttttt 11340
gtgtgtttgc agctcataaa aaaaacgcga acctgcaggc atggcggcgg caacaacaac 11400
aacaacaaca tcttcttcga tctccttctc caccaaacca tctccttcct cctccaaatc 11460
accattacca atctccagat tctccctccc attctcccta aaccccaaca aatcatcctc 11520 
ctcctcccgc cgccgcggta tcaaatccag ctctccctcc tccatctccg ccgtgctcaa 11580
cacaaccacc aatgtcacaa ccactccctc tccaaccaaa cctaccaaac ccgaaacatt 11640
catctcccga ttcgctccag atcaaccccg caaaggcgct gatatcctcg tcgaagcttt 11700
agaacgtcaa ggcgtagaaa ccgtattcgc ttaccctgga ggtgcatcaa tggagattca 11760
ccaagcctta acccgctctt cctcaatccg taacgtcctt cctcgtcacg aacaaggagg 11820 
tgtattcgca gcagaaggat acgctcgatc ctcaggtaaa ccaggtatct gtatagccac 11880
ttcaggtccc ggagctacaa atctcgttag cggattagcc gatgcgttgt tagatagtgt 11940
tcctcttgta gcaatcacag gacaagtccc tcgtcgtatg attggtacag atgcgtttca 12000
agagactccg attgttgagg taacgcgttc gattacgaag cataactatc ttgtgatgga 12060
tgttgaagat atccctagga ttattgagga agctttcttt ttagctactt ctggtagacc 12120
tggacctgtt ttggttgatg ttcctaaaga tattcaacaa cagcttgcga ttcctaattg 12180
ggaacaggct atgagattac ctggttatat gtctaggatg cctaaacctc cggaagattc 12240
tcatttggag cagattgtta ggttgatttc tgagtctaag aagcctgtgt tgtatgttgg 12300
tggtggttgt ttgaattcta gcgatgaatt gggtaggttt gttgagctta cggggatccc 12360
tgttgcgagt acgttgatgg ggctgggatc ttatccttgt gatgatgagt tgtcgttaca 12420
tatgcttgga atgcatggga ctgtgtatgc aaattacgct gtggagcata gtgatttgtt 12480
gttggcgttt ggggtaaggt ttgatgatcg tgtcacgggt aagcttgagg cttttgctag 12540
tagggctaag attgttcata ttgatattga ctcggctgag attgggaaga ataagactcc 12600
tcatgtgtct gtgtgtggtg atgttaagct ggctttgcaa gggatgaata aggttcttga 12660
gaaccgagcg gaggagctta agcttgattt tggagtttgg aggaatgagt tgaacgtaca 12720
gaaacagaag tttccgttga gctttaagac gtttggggaa gctattcctc cacagtatgc 12780
gattaaggtc cttgatgagt tgactgatgg aaaagccata ataagtactg gtgtcgggca 12840
acatcaaatg tgggcggcgc agttctacaa ttacaagaaa ccaaggcagt ggctatcatc 12900
aggaggcctt ggagctatgg gatttggact tcctgctgcg attggagcgt ctgttgctaa 12960 
ccctgatgcg atagttgtgg atattgacgg agatggaagc tttataatga atgtgcaaga 13020
gctagccact attcgtgtag agaatcttcc agtgaaggta cttttattaa acaaccagca 13080
tcttggcatg gttatgcaat gggaagatcg gttctacaaa gctaaccgag ctcacacatt 13140
tctcggggat ccggctcagg aggacgagat attcccgaac atgttgctgt ttgcagcagc 13200
ttgcgggatt ccagcggcga gggtgacaaa gaaagcagat ctccgagaag ctattcagac 13260
aatgctggat acaccaggac cttacctgtt ggatgtgatt tgtccgcacc aagaacatgt 13320
gttgccgatg atcccgaatg gtggcacttt caacgatgtc ataacggaag gagatggccg 13380
gattaaatac tgagagatga aaccggcctg gccggcccgg agtggggagg cacgatggcc 13440
gctttggtcg atcgacggga tcgatcctgc tttaatgaga tatgcgagac gcctatgatc 13500
gcatgatatt tgctttcaat tctgttgtgc acgttgtaaa aaacctgagc atgtgtagct 13560
cagatcctta ccgccggttt cggttcattc taatgaatat atcacccgtt actatcgtat 13620
ttttatgaat aatattctcc gttcaattta ctgattgtac cctactactt atatgtacaa 13680
tattaaaatg aaaacaatat attgtgctga ataggtttat agcgacatct atgatagagc 13740
gccacaataa caaacaattg cgttttatta ttacaaatcc aattttaaaa aaagcggcag 13800
aaccggtcaa acctaaaaga ctgattacat aaatcttatt caaatttcaa aaggccccag 13860
gggctagtat ctacgacaca ccgagcggcg aactaataac gttcactgaa gggaactccg 13920
gttccccgcc ggcgcgcatg ggtgagattc cttgaagttg agtattggcc gtccgctcta 13980
ccgaaagtta cgggcaccat tcaacccggt ccagcacggc ggccgggtaa ccgacttgct 14040
gccccgagaa ttatgcagca tttttttggt gtatgtgggc cccaaatgaa gtgcaggtca 14100
aaccttgaca gtgacgacaa atcgttgggc gggtccaggg cgaattttgc gacaacatgt 14160
cgaggctcag caggatgggc ccag 14184

<210> 59 
<211> 1011 
<212> DNA 
<213> Zea mays 

<220> 

<221> CDS

<222> (1)..(981)

<223> coding for 5-methylthioribose kinase
<400> 59 

gca cga gca ctc ctc tcc tct cct ctc gcc ggc gca tcg ccc gac tgt 48
Ala Arg Ala Leu Leu Ser Ser Pro Leu Ala Gly Ala Ser Pro Asp Cys
1 5 10 15

cag tca gcc tca gcc atg gcc gcg gag gag gag cag ggc ttc cgc ccg 96
Gln Ser Ala Ser Ala Met Ala Ala Glu Glu Glu Gln Gly Phe Arg Pro
20 25 30

ctg gac gag tcg tcc ctg ctc gcc tac atc aag gcc acg ccg gcg ctc 144
Leu Asp Glu Ser Ser Leu Leu Ala Tyr Ile Lys Ala Thr Pro Ala Leu
35 40 45

gcc tcc cgc ctc ggc ggc ggt ggc agt cta gac tcc atc gag atc aag 192
Ala Ser Arg Leu Gly Gly Gly Gly Ser Leu Asp Ser Ile Glu Ile Lys
50 55 60

gag gtc ggc gac ggc aac ctc aac ttc gtc tac atc gtg cag tcc gag 240
Glu Val Gly Asp Gly Asn Leu Asn Phe Val Tyr Ile Val Gln Ser Glu
65 70 75 80

gcc ggc gcc atc gtc gtc aag cag gcg ctc ccg tac gtg cgc tgc gtg 288
Ala Gly Ala Ile Val Val Lys Gln Ala Leu Pro Tyr Val Arg Cys Val
85 90 95 

ggg gat tcg tgg ccc atg acg cgg gag cgc gcc tac ttc gag gcc tcc 336
Gly Asp Ser Trp Pro Met Thr Arg Glu Arg Ala Tyr Phe Glu Ala Ser 
100 105 110 

acg ctg cgg gag cac ggc cgc ctg tgc ccg gag cac acc ccc gag gtg 384
Thr Leu Arg Glu His Gly Arg Leu Cys Pro Glu His Thr Pro Glu Val
115 120 125

tac cac ttc gac cgg acc ttg tcg ctg atg ggg atg cgc tac atc gag 432
Tyr His Phe Asp Arg Thr Leu Ser Leu Met Gly Met Arg Tyr Ile Glu
130 135 140

ccc ccg cac atc atc ctc cgc aag ggc ctc gtc gcc ggt gtc gag tac 480
Pro Pro His Ile Ile Leu Arg Lys Gly Leu Val Ala Gly Val Glu Tyr
145 150 155 160 

ccg ctg ctc gcc gac cac atg tcc gat tac atg gcc aag acg ctc ttc 528
Pro Leu Leu Ala Asp His Met Ser Asp Tyr Met Ala Lys Thr Leu Phe
165 170 175

ttc acc tcc ctc ctc tat aac aat acc acg gat cat aag aac gga gtt 576
Phe Thr Ser Leu Leu Tyr Asn Asn Thr Thr Asp His Lys Asn Gly Val
180 185 190

gct aag tac tct gcg aac gtg gag atg tgt agg ctc acg gag caa gtt 624
Ala Lys Tyr Ser Ala Asn Val Glu Met Cys Arg Leu Thr Glu Gln Val
195 200 205 

gtg ttc tcg gac cca tac cgt gtt tcc aaa ttt aat cgg tgg acc tcg 672
Val Phe Ser Asp Pro Tyr Arg Val Ser Lys Phe Asn Arg Trp Thr Ser 
210 215 220 

cct tat ctc gac aaa gat gct gag gca gtt cgc gag gat gat gag ctc 720
Pro Tyr Leu Asp Lys Asp Ala Glu Ala Val Arg Glu Asp Asp Glu Leu 
225 230 235 240 

aag ttg gaa gta gct ggg ctg aaa tcg atg ttt atc gag aga gct caa 768
Lys Leu Glu Val Ala Gly Leu Lys Ser Met Phe Ile Glu Arg Ala Gln
245 250 255

gct ctg att cat gga gat ctc cac act ggt tct atc atg gtg acc gaa 816
Ala Leu Ile His Gly Asp Leu His Thr Gly Ser Ile Met Val Thr Glu
260 265 270

gtt caa ctc aag tca ttg atc cag aat ttg ggt tct atg ggg cca atg 864
Val Gln Leu Lys Ser Leu Ile Gln Asn Leu Gly Ser Met Gly Pro Met
275 280 285

ggg ttt gat att ggg agc ctt cct tgg aaa cct gat ttt ggg cat act 912
Gly Phe Asp Ile Gly Ser Leu Pro Trp Lys Pro Asp Phe Gly His Thr
290 295 300

atg cac aga atg ggc atg ctg atc aag cga atg atc gta agg ctt aca 960
Met His Arg Met Gly Met Leu Ile Lys Arg Met Ile Val Arg Leu Thr
305 310 315 320

aga atg gat ctt gaa gac aat tgaagagtcg tggaatttgt tccacaaaaa 1011
Arg Met Asp Leu Glu Asp Asn
325

<210> 60 
<211> 327 
<212> PRT 
<213> Zea mays

<400> 60 

Ala Arg Ala Leu Leu Ser Ser Pro Leu Ala Gly Ala Ser Pro Asp Cys 
1 5 10 15

Gln Ser Ala Ser Ala Met Ala Ala Glu Glu Glu Gln Gly Phe Arg Pro
20 25 30

Leu Asp Glu Ser Ser Leu Leu Ala Tyr Ile Lys Ala Thr Pro Ala Leu
35 40 45

Ala Ser Arg Leu Gly Gly Gly Gly Ser Leu Asp Ser Ile Glu Ile Lys
50 55 60

Glu Val Gly Asp Gly Asn Leu Asn Phe Val Tyr Ile Val Gln Ser Glu
65 70 75 80

Ala Gly Ala Ile Val Val Lys Gln Ala Leu Pro Tyr Val Arg Cys Val
85 90 95 

Gly Asp Ser Trp Pro Met Thr Arg Glu Arg Ala Tyr Phe Glu Ala Ser 
100 105 110 

Thr Leu Arg Glu His Gly Arg Leu Cys Pro Glu His Thr Pro Glu Val 
115 120 125 
Tyr His Phe Asp Arg Thr Leu Ser Leu Met Gly Met Arg Tyr Ile Glu
130 135 140

Pro Pro His Ile Ile Leu Arg Lys Gly Leu Val Ala Gly Val Glu Tyr
145 150 155 160 

Pro Leu Leu Ala Asp His Met Ser Asp Tyr Met Ala Lys Thr Leu Phe 
165 170 175 

Phe Thr Ser Leu Leu Tyr Asn Asn Thr Thr Asp His Lys Asn Gly Val 
180 185 190 

Ala Lys Tyr Ser Ala Asn Val Glu Met Cys Arg Leu Thr Glu Gln Val
195 200 205

Val Phe Ser Asp Pro Tyr Arg Val Ser Lys Phe Asn Arg Trp Thr Ser 
210 215 220 

Pro Tyr Leu Asp Lys Asp Ala Glu Ala Val Arg Glu Asp Asp Glu Leu 
225 230 235 240 

Lys Leu Glu Val Ala Gly Leu Lys Ser Met Phe Ile Glu Arg Ala Gln
245 250 255

Ala Leu Ile His Gly Asp Leu His Thr Gly Ser Ile Met Val Thr Glu
260 265 270

Val Gln Leu Lys Ser Leu Ile Gln Asn Leu Gly Ser Met Gly Pro Met
275 280 285

Gly Phe Asp Ile Gly Ser Leu Pro Trp Lys Pro Asp Phe Gly His Thr
290 295 300

Met His Arg Met Gly Met Leu Ile Lys Arg Met Ile Val Arg Leu Thr
305 310 315 320 

Arg Met Asp Leu Glu Asp Asn 
325 



<210> 61 
<211> 471 
<212> DNA 

<213> Brassica napus 

<220> 

<221> CDS 

<222> (2)..(469)

<223> coding for 5-methylthioribose kinase
<400> 61 

a ttt ccg ggt cga cga ttt cgt ggc aat ctc aac ttc gtt ttc atc gtc 49
Phe Pro Gly Arg Arg Phe Arg Gly Asn Leu Asn Phe Val Phe Ile Val
1 5 10 15

atc gga tcc act ggc tca ctc gtc atc aaa cag gcg ctt ccg tat ata 97
Ile Gly Ser Thr Gly Ser Leu Val Ile Lys Gln Ala Leu Pro Tyr Ile
20 25 30

cgt tgt att ggg gag tct tgg cca atg acg aaa gaa aga gct tac ttt 145
Arg Cys Ile Gly Glu Ser Trp Pro Met Thr Lys Glu Arg Ala Tyr Phe
35 40 45

gaa gct aca act ctg aga aag cac gga gct ttg tct cct gat cat gtt 193
Glu Ala Thr Thr Leu Arg Lys His Gly Ala Leu Ser Pro Asp His Val 
50 55 60 

cct gaa gtc tac cat ttt gac agg acc atg gct ttg att gga atg agg 241
Pro Glu Val Tyr His Phe Asp Arg Thr Met Ala Leu Ile Gly Met Arg
65 70 75 80

tat ctg gag cct cct cac atc atc ctc cgc aaa gga ctc gtt gct gga 289
Tyr Leu Glu Pro Pro His Ile Ile Leu Arg Lys Gly Leu Val Ala Gly
85 90 95

atc cag tac cct ttc ctt gca gaa cac atg gct gat tac atg gcc aaa 337
Ile Gln Tyr Pro Phe Leu Ala Glu His Met Ala Asp Tyr Met Ala Lys
100 105 110

acc ctc ttc ttc act tcg ctc ctc tat cat gat acc aca gag cac aaa 385
Thr Leu Phe Phe Thr Ser Leu Leu Tyr His Asp Thr Thr Glu His Lys
115 120 125

aga gca gta acc gag ttt tgt ggt aat gtg gag tta tgc cgg tta acg 433
Arg Ala Val Thr Glu Phe Cys Gly Asn Val Glu Leu Cys Arg Leu Thr
130 135 140

gag caa gta gtg ttc tct gac ccg tat aga gtt tct ag 471
Glu Gln Val Val Phe Ser Asp Pro Tyr Arg Val Ser
145 150 155

<210> 62
<211> 156
<212> PRT

<213> Brassica napus
<400> 62

Phe Pro Gly Arg Arg Phe Arg Gly Asn Leu Asn Phe Val Phe Ile Val
1 5 10 15

Ile Gly Ser Thr Gly Ser Leu Val Ile Lys Gln Ala Leu Pro Tyr Ile
20 25 30

Arg Cys Ile Gly Glu Ser Trp Pro Met Thr Lys Glu Arg Ala Tyr Phe
35 40 45

Glu Ala Thr Thr Leu Arg Lys His Gly Ala Leu Ser Pro Asp His Val
50 55 60

Pro Glu Val Tyr His Phe Asp Arg Thr Met Ala Leu Ile Gly Met Arg
65 70 75 80

Tyr Leu Glu Pro Pro His Ile Ile Leu Arg Lys Gly Leu Val Ala Gly
85 90 95

Ile Gln Tyr Pro Phe Leu Ala Glu His Met Ala Asp Tyr Met Ala Lys
100 105 110

Thr Leu Phe Phe Thr Ser Leu Leu Tyr His Asp Thr Thr Glu His Lys
115 120 125

Arg Ala Val Thr Glu Phe Cys Gly Asn Val Glu Leu Cys Arg Leu Thr
130 135 140

Glu Gln Val Val Phe Ser Asp Pro Tyr Arg Val Ser
145 150 155

<210> 63 
<211> 415 
<212> DNA 

<213> Brassica napus 
<220>

<221> CDS

<222> (3)..(413)

<223> coding for 5-methylthioribose kinase
<400> 63 

gg gtc gac gat ttc gtg ctg aga gca aaa gag atg tcg ttc gat gag 47
Val Asp Asp Phe Val Leu Arg Ala Lys Glu Met Ser Phe Asp Glu
1 5 10 15

ttc aag ccg ttg aac gag aaa tct cta gta gag tac ata aag gca acg 95
Phe Lys Pro Leu Asn Glu Lys Ser Leu Val Glu Tyr Ile Lys Ala Thr
20 25 30

cct gcc ctc tcc tcc agg ctc gga gac aag tac gat gat ctg gtc atc 143
Pro Ala Leu Ser Ser Arg Leu Gly Asp Lys Tyr Asp Asp Leu Val Ile
35 40 45

aag gaa gtt gga gat ggc aat ctc aac ttc gtt ttc atc gtt gtc gga 191
Lys Glu Val Gly Asp Gly Asn Leu Asn Phe Val Phe Ile Val Val Gly
50 55 60

tcc act ggc tca ctc gtc atc aaa cag gcg ctt ccg tat ata cgt tgt 239
Ser Thr Gly Ser Leu Val Ile Lys Gln Ala Leu Pro Tyr Ile Arg Cys
65 70 75

att gga gaa tca tgg cca atg acg aaa gaa aga gct tac ttt gaa gca 287
Ile Gly Glu Ser Trp Pro Met Thr Lys Glu Arg Ala Tyr Phe Glu Ala
80 85 90 95

aca act ctg aga aag cac ggt ggt ttg tct ccg gat cat gtt cct gaa 335
Thr Thr Leu Arg Lys His Gly Gly Leu Ser Pro Asp His Val Pro Glu
100 105 110

gtc tac cat ttt gac aga acc atg gct ttg att gga atg aga tac ctc 383
Val Tyr His Phe Asp Arg Thr Met Ala Leu Ile Gly Met Arg Tyr Leu
115 120 125

gag cct cct cac atc atc ctc cgc aaa gga ct 415
Glu Pro Pro His Ile Ile Leu Arg Lys Gly
130 135 

<210> 64 
<211> 137 
<212> PRT 

<213> Brassica napus 
<400> 64 

Val Asp Asp Phe Val Leu Arg Ala Lys Glu Met Ser Phe Asp Glu Phe
1 5 10 15

Lys Pro Leu Asn Glu Lys Ser Leu Val Glu Tyr Ile Lys Ala Thr Pro
20 25 30

Ala Leu Ser Ser Arg Leu Gly Asp Lys Tyr Asp Asp Leu Val Ile Lys
35 40 45

Glu Val Gly Asp Gly Asn Leu Asn Phe Val Phe Ile Val Val Gly Ser
50 55 60

Thr Gly Ser Leu Val Ile Lys Gln Ala Leu Pro Tyr Ile Arg Cys Ile
65 70 75 80 



Gly Glu Ser Trp Pro Met Thr Lys Glu Arg Ala Tyr Phe Glu Ala Thr 
85 90 95 
Thr Leu Arg Lys His Gly Gly Leu Ser Pro Asp His Val Pro Glu Val
100 105 110

Tyr His Phe Asp Arg Thr Met Ala Leu Ile Gly Met Arg Tyr Leu Glu
115 120 125

Pro Pro His Ile Ile Leu Arg Lys Gly
130 135 

<210> 65 
<211> 424 
<212> DNA 

<213> Oryza sativa 

<220> 

<221> CDS 

<222> (3)..(422)

<223> coding for 5-methylthioribose kinase
<400> 65 

cc ctt ctc tac aac tcc acc act gat cac aag aaa gga gtt gct cag 47
Leu Leu Tyr Asn Ser Thr Thr Asp His Lys Lys Gly Val Ala Gln
1 5 10 15

tac tgc gat aat gtg gag atg tgt agg ctc aca gag caa gtc gtg ttc 95
Tyr Cys Asp Asn Val Glu Met Cys Arg Leu Thr Glu Gln Val Val Phe
20 25 30

tca gac cca tac atg ctc gcc aaa tac aat cgt tgc aca tca ccc ttc 143
Ser Asp Pro Tyr Met Leu Ala Lys Tyr Asn Arg Cys Thr Ser Pro Phe
35 40 45

cta gat aat gat gct gca gcg gtt cga gag gat gct gag ctt aaa ttg 191
Leu Asp Asn Asp Ala Ala Ala Val Arg Glu Asp Ala Glu Leu Lys Leu
50 55 60

gag att gct gaa ttg aaa tca atg ttt att gag aga gca cag gct ctt 239
Glu Ile Ala Glu Leu Lys Ser Met Phe Ile Glu Arg Ala Gln Ala Leu
65 70 75

ctt cat gga gat ctc cac act ggt tcc atc atg gtg aca cca gat tct 287
Leu His Gly Asp Leu His Thr Gly Ser Ile Met Val Thr Pro Asp Ser
80 85 90 95

act caa gtg att gat cca gaa ttt gct ttc tat ggc cca atg ggt tac 335
Thr Gln Val Ile Asp Pro Glu Phe Ala Phe Tyr Gly Pro Met Gly Tyr
100 105 110

gac att ggg gcc ttc ctg ggg aac ttg att ttg gca tat ttt tca caa 383
Asp Ile Gly Ala Phe Leu Gly Asn Leu Ile Leu Ala Tyr Phe Ser Gln
115 120 125

gat gga cac gct gat caa gca aat gat cgt aag gct tac aa 424
Asp Gly His Ala Asp Gln Ala Asn Asp Arg Lys Ala Tyr
130 135 140 

<210> 66 
<211> 140 
<212> PRT 

<213> Oryza sativa 
<400> 66 

Leu Leu Tyr Asn Ser Thr Thr Asp His Lys Lys Gly Val Ala Gln Tyr
1 5 10 15

Cys Asp Asn Val Glu Met Cys Arg Leu Thr Glu Gln Val Val Phe Ser
20 25 30

Asp Pro Tyr Met Leu Ala Lys Tyr Asn Arg Cys Thr Ser Pro Phe Leu
35 40 45

Asp Asn Asp Ala Ala Ala Val Arg Glu Asp Ala Glu Leu Lys Leu Glu
50 55 60

Ile Ala Glu Leu Lys Ser Met Phe Ile Glu Arg Ala Gln Ala Leu Leu
65 70 75 80

His Gly Asp Leu His Thr Gly Ser Ile Met Val Thr Pro Asp Ser Thr
85 90 95

Gln Val Ile Asp Pro Glu Phe Ala Phe Tyr Gly Pro Met Gly Tyr Asp
100 105 110

Ile Gly Ala Phe Leu Gly Asn Leu Ile Leu Ala Tyr Phe Ser Gln Asp
115 120 125

Gly His Ala Asp Gln Ala Asn Asp Arg Lys Ala Tyr
130 135 140 

<210> 67 

<211> 404 

<212> DNA 

<213> Glycine max 

<220> 

<221> CDS 

<222> (3)..(404)

<223> coding for 5-methylthioribose kinase
<400> 67 

ta atc ccc gaa cat gtt cct gaa gtg tat cac ttt gac cgt acc atg 47
Ile Pro Glu His Val Pro Glu Val Tyr His Phe Asp Arg Thr Met
1 5 10 15

tct ttg atc ggt atg cgt tac ttg gag ccc cca cat ata atc ctc ata 95
Ser Leu Ile Gly Met Arg Tyr Leu Glu Pro Pro His Ile Ile Leu Ile
20 25 30

aaa ggg ttg att gct ggg att gag tac cct ttt ttg gct gaa cac atg 143
Lys Gly Leu Ile Ala Gly Ile Glu Tyr Pro Phe Leu Ala Glu His Met
35 40 45

gct gat ttc atg gcg aag aca ctc ttc ttc acg tct ctg ctt ttc cgt 191
Ala Asp Phe Met Ala Lys Thr Leu Phe Phe Thr Ser Leu Leu Phe Arg
50 55 60

tcc act gct gac cac aaa cgg gac gtt gcc gaa ttt tgt ggg aat gtg 239
Ser Thr Ala Asp His Lys Arg Asp Val Ala Glu Phe Cys Gly Asn Val
65 70 75

gag tta tgc agg ctc act gaa cag gtc gtt ttc tct gac cct tat aaa 287
Glu Leu Cys Arg Leu Thr Glu Gln Val Val Phe Ser Asp Pro Tyr Lys
80 85 90 95

gtt tct caa tat aat cgt tgg act tcc ccc tat ctt gat cgt gat gct 335
Val Ser Gln Tyr Asn Arg Trp Thr Ser Pro Tyr Leu Asp Arg Asp Ala
100 105 110

gag gct gtt cgg gaa gac aat ctg ctg aag ctt gaa gtt gct gag ctg 383
Glu Ala Val Arg Glu Asp Asn Leu Leu Lys Leu Glu Val Ala Glu Leu
115 120 125

aaa tcc aag ttc att gag agc 404
Lys Ser Lys Phe Ile Glu Ser
130 

<210> 68 

<211> 134 

<212> PRT 

<213> Glycine max 

<400> 68 

Ile Pro Glu His Val Pro Glu Val Tyr His Phe Asp Arg Thr Met Ser
1 5 10 15

Leu Ile Gly Met Arg Tyr Leu Glu Pro Pro His Ile Ile Leu Ile Lys
20 25 30

Gly Leu Ile Ala Gly Ile Glu Tyr Pro Phe Leu Ala Glu His Met Ala
35 40 45

Asp Phe Met Ala Lys Thr Leu Phe Phe Thr Ser Leu Leu Phe Arg Ser
50 55 60

Thr Ala Asp His Lys Arg Asp Val Ala Glu Phe Cys Gly Asn Val Glu
65 70 75 80

Leu Cys Arg Leu Thr Glu Gln Val Val Phe Ser Asp Pro Tyr Lys Val
85 90 95

Ser Gln Tyr Asn Arg Trp Thr Ser Pro Tyr Leu Asp Arg Asp Ala Glu
100 105 110

Ala Val Arg Glu Asp Asn Leu Leu Lys Leu Glu Val Ala Glu Leu Lys
115 120 125

Ser Lys Phe Ile Glu Ser
130 



<210> 69 
<211> 21 
<212> DNA

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 69 

cgtgaatacg gcgtggagtc g 21 

<210> 70 
<211> 20 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 70 

cggcaggata atcaggttgg 20 

<210> 71 
<211> 20 
<212> DNA 

<213> Artificial sequence 
<220>

<223> Description of the artificial sequence: 
oligonucleotide primer 

<400> 71 

gtcaacgtaa ccaaccctgc 20
We claim:

1. A process for preparing transformed plant cells or organisms,
which comprises the following steps:

a) transforming a population of plant cells, with the cells
of said population containing at least one marker protein
capable of causing directly or indirectly a toxic effect
for said population, with at least one nucleic acid
sequence to be inserted in combination with at least one
double-stranded marker protein ribonucleic acid sequence
or an expression cassette or expression cassettes
ensuring expression thereof capable of reducing the expression
of at least one marker protein, and

b) selecting transformed plant cells whose genome contains
said nucleic acid sequence and which have a growth
advantage over nontransformed cells, due to the action of said
double-stranded marker protein ribonucleic acid sequence,
from said population of plant cells, the selection being
carried out under conditions under which the marker
protein can exert its toxic effect on the nontransformed
cells.



2. The process as claimed in claim 1, wherein the marker protein
is capable of converting directly or indirectly a substance X
which is nontoxic for said population of plant cells into a
substance Y which is toxic for said population, which process
comprises the following steps:

a) transforming the population of plant cells with at least
one nucleic acid sequence to be inserted in combination
with at least one double-stranded marker protein
ribonucleic acid sequence or an expression cassette or
expression cassettes ensuring expression thereof capable of
reducing the expression of at least one marker protein, and

b) treating said population of plant cells with the
substance X at a concentration which causes a toxic effect
for nontransformed cells, due to the conversion by the
marker protein, and

c) selecting transformed plant cells whose genome contains
said nucleic acid sequence and which have a growth
advantage over nontransformed cells, due to the action of said
double-stranded marker protein ribonucleic acid sequence,
from said population of plant cells, the selection being
carried out under conditions under which the marker
protein can exert its toxic effect on the nontransformed
cells.

3. The process as claimed in claim 2, wherein the nontoxic
substance X is a substance which does not naturally occur in
plant cells or organisms or occurs naturally therein only at
a concentration which can essentially not cause any toxic
effect.

4. The process as claimed in claim 2 or 3, wherein the substance
X is a substance selected from the group consisting of
proherbicides, proantibiotics, nucleoside analogs, 5-fluorocytosine,
auxinamide compounds, naphthalacetamide, dihaloalkanes,
Acyclovir, Ganciclovir, 1,2-deoxy-2-fluoro-b-D-arabinofuranosil-5-iodouracil,
6-thioxanthine, allopurinol, 6-methylpurine
deoxyribonucleoside, 4-aminopyrazolopyrimidine,
2-amino-4-methoxybutanoic acid, 5-(trifluoromethyl)thioribose and allyl
alcohol.

5. The process as claimed in any of claims 1 to 4, wherein the
marker protein is selected from the group consisting of
cytosine deaminases, cytochrome P-450 enzymes, indoleacetic acid
hydrolases, haloalkane dehalogenases, thymidine kinases,
guanine phosphoribosyl transferases, hypoxanthine phosphoribosyl
transferases, xanthine guanine phosphoribosyl transferases,
purine nucleoside phosphorylases, phosphonate monoester
hydrolases, indoleacetamide synthases, indoleacetamide
hydrolases, adenine phosphoribosyl transferases, methoxinine
dehydrogenases, rhizobitoxin synthases, 5-methylthioribose
kinases and alcohol dehydrogenases.

6. The process as claimed in any of claims 1 to 5, wherein the 
marker protein is encoded by 

a) a sequence described by the GenBank accession number 
S56903, M32238, NC003308, AE009419, AB016260, NC002147, 
M26950, J02224, V00470, V00467, U10247, M13422, X00221, 
M60917, U44852, M61151, AF039169, AB025110, AF212863, 
AC079674, X77943, M12196, AF172282, X04049 or AF253472 

b) a sequence according to SEQ ID NO: 2, 4, 6, 8, 10, 12, 
14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 
42, 44, 46 or 48.

7. The process as claimed in any of claims 1 to 6, wherein a
sequence coding for a resistance to at least one toxin,
antibiotic or herbicide is introduced together with the nucleic
acid sequence to be inserted and selection is carried out
additionally using the toxin, antibiotic or herbicide.

8. The process as claimed in any of claims 1 to 7, wherein the
nucleic acid sequence to be inserted into the genome of the
plant cell or of the plant organism comprises at least one
expression cassette capable of expressing, under the control
of a promoter functional in plant cells or in plant organisms,
an RNA and/or a protein which does not cause the expression,
amount, activity and/or function of a marker protein
to be reduced.

9. The process as claimed in any of claims 1 to 8, wherein the 
plant cell is part of a plant organism or of a tissue, part, 
organ, cell culture or propagation material derived there- 
from. 

10. The process as claimed in any of claims 1 to 9 for preparing
transformed plant cells or organisms, which comprises the
following steps:

a) transforming a population of plant cells which comprises
at least one non-endogenous (preferably non-plant) marker
protein capable of converting directly or indirectly a
substance X which is nontoxic for said population of
plant cells into a substance Y which is toxic for said
population, with at least one nucleic acid sequence to be
inserted in combination with at least one nucleic acid
sequence coding for a double-stranded marker protein
ribonucleic acid sequence or an expression cassette or
expression cassettes ensuring expression thereof
ribonucleic acid sequence capable of reducing the expression,
amount, activity and/or function of said marker protein,
and

b) treating said population of plant cells with the
substance X at a concentration which causes a toxic effect
for nontransformed cells, due to the conversion by the
marker protein, and

c) selecting transformed plant cells (and/or populations of
plant cells, such as plant tissues or plants) whose
genome contains said nucleic acid sequence and which have a
growth advantage over nontransformed cells, due to the
action of said double-stranded marker protein ribonucleic
acid sequence, from said population of plant cells, the
selection being carried out under conditions under which
the marker protein can exert its toxic effect on the
nontransformed cells, and

d) regenerating fertile plants, and

e) eliminating by crossing the nucleic acid sequence coding
for the marker protein and isolating fertile plants whose
genome contains said nucleic acid sequence but does not
contain any longer the sequence coding for the marker
protein.

11. An amino acid sequence coding for a plant 5-methylthioribose
kinase, wherein said amino acid sequence contains at least
one sequence selected from the group consisting of SEQ ID NO:
60, 62, 64, 66 or 68.

12. A nucleic acid sequence coding for a plant 5-methylthioribose 
kinase, wherein said nucleic acid sequence contains at least 
one sequence selected from the group consisting of SEQ ID NO: 
59, 61, 63, 65 or 67. 

13. A double-stranded RNA molecule, comprising

a) a "sense" RNA strand comprising at least one ribonucleo- 
tide sequence which is essentially identical to at least 
a part of the "sense" RNA transcript of a nucleic acid 
sequence coding for a marker protein, and 

b) an "antisense" RNA strand which is essentially, prefer- 
ably fully, complementary to the RNA sense strand under 
a). 

14. The double-stranded RNA molecule as claimed in claim 13,
wherein the marker protein is defined as in any of claims 2 
to 6. 

15. The double-stranded RNA molecule as claimed in either of
claims 13 and 14, wherein the "sense" RNA strand and the
"antisense" RNA strand are covalently linked to one another in
the form of an inverted repeat.

16. A transgenic expression cassette, comprising a nucleic acid 
sequence which codes for a double-stranded RNA molecule as 
claimed in any of claims 13 to 15 and which is functionally 
linked to a promoter functional in plant organisms.

17. A transgenic vector, comprising a transgenic expression
cassette as claimed in claim 16.

18. A transgenic plant organism, comprising a double-stranded RNA
molecule as claimed in any of claims 13 to 15, a transgenic
expression cassette as claimed in claim 16 or a transgenic
vector as claimed in claim 17.

19. The transgenic plant organism as claimed in claim 18,
selected from the group of plants, consisting of wheat, oats,
millet, barley, rye, corn, rice, buckwheat, sorghum,
triticale, spelt, linseed, sugar cane, oilseed rape, cress,
arabidopsis, cabbage species, soybean, alfalfa, pea, bean plants,
peanut, potato, tobacco, tomato, eggplant, paprika,
sunflower, tagetes, lettuce, calendula, melon, pumpkin and zucchini.

20. A tissue, an organ, a part, a cell, a cell culture or propa- 
gation material, derived from a transgenic plant organism as 
claimed in either of claims 18 and 19. 
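
The inverted-repeat arrangement recited in claims 13 and 15, a "sense" strand covalently linked to a complementary "antisense" strand, can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions and not a construct disclosed in this document: the sense fragment, the spacer and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.lower().translate(COMPLEMENT)[::-1]


def inverted_repeat(sense: str, spacer: str) -> str:
    """DNA template: sense arm, spacer, then the antisense arm.

    Transcription of this template gives a single RNA whose two arms
    can fold back on each other to form a double-stranded region.
    """
    sense = sense.lower()
    return sense + spacer.lower() + reverse_complement(sense)


def arms_are_complementary(template: str, arm_length: int) -> bool:
    """Check that the last arm is the reverse complement of the first."""
    return reverse_complement(template[:arm_length]) == template[-arm_length:]


# Hypothetical 27-nucleotide fragment of a marker gene and a
# hypothetical 9-nucleotide spacer; neither is taken from the claims.
SENSE_FRAGMENT = "gagttatgcaggctcactgaacaggtc"
SPACER = "ttcaagaga"

template = inverted_repeat(SENSE_FRAGMENT, SPACER)
transcript = template.replace("t", "u")  # RNA transcript of the template

print(template)
print(arms_are_complementary(template, len(SENSE_FRAGMENT)))
```

In practice the sense arm would be far longer than this toy fragment, and the choice of spacer is a design decision that the sketch does not address.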

[Figure: sequence alignment, positions 1 to 200. Rows: Klebsiella pneumoniae, Clostridium tetani, Zea mays, A. thaliana, Brassica napus-2, Soy-1, Oryza sativa-1, Consensus.]